1
Commit Graph

3023 Commits

Author SHA1 Message Date
Rafael J. Wysocki
4526c58109 thermal: gov_bang_bang: Clean up thermal_zone_trip_update()
Do the following cleanups in thermal_zone_trip_update():

 * Drop the useless "zero hysteresis" message.
 * Eliminate the trip_index local variable that is redundant.
 * Drop 2 comments that are not useful.
 * Downgrade a diagnostic message from pr_warn() to pr_debug().
 * Use consistent field formatting in diagnostic messages.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-04-23 20:38:06 +02:00
Rafael J. Wysocki
530c932bdf thermal: gov_bang_bang: Use .trip_crossed() instead of .throttle()
The Bang-Bang governor really is only concerned about trip point
crossing, so it can use the new .trip_crossed() callback instead of
.throttle() that is not particularly suitable for it.

Modify it to do so which also takes trip hysteresis into account, so the
governor does not need to use it directly any more.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-04-23 20:37:41 +02:00
Greg Kroah-Hartman
e5019b1423 Merge 6.9-rc5 into driver-core-next
We want the kernfs fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-23 13:27:43 +02:00
Binbin Zhou
734b5def91 thermal/drivers/loongson2: Add Loongson-2K2000 support
The Loongson-2K2000 and Loongson-2K1000 have similar thermal sensors,
except that the temperature is read differently.

In particular, the temperature output registers of the Loongson-2K2000
are defined in the chip configuration domain and are read in a different
way.

Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
Acked-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/fdbfdcc3231a36a4ee0bcf1377ddcbd6f8c944b5.1713837379.git.zhoubinbin@loongson.cn
2024-04-23 12:40:30 +02:00
Binbin Zhou
6ef6d5ff5e thermal/drivers/loongson2: Trivial code style adjustment
Here are some minor code style adjustment. Such as fix whitespace code
style; align function call arguments to opening parenthesis.

Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
Acked-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/ccca50f2ad3fd8c84fcbfcb1f875427ea7f637a0.1713837379.git.zhoubinbin@loongson.cn
2024-04-23 12:40:30 +02:00
Nicolas Pitre
f4745f546e thermal/drivers/mediatek/lvts_thermal: Add MT8188 support
Various values extracted from the vendor's kernel driver.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240402032729.2736685-15-nico@fluxnic.net
2024-04-23 12:40:30 +02:00
Nicolas Pitre
11e6f4c314 thermal/drivers/mediatek/lvts_thermal: Allow early empty sensor slots
Some systems don't always populate sensor controller slots starting
at slot 0. Use a bitmap instead of a count to indicate valid sensor
slots. Also create a pretty iterator for that.

About that iterator: it causes checkpatch to complain with "ERROR:
Macros with multiple statements should be enclosed in a do - while
loop". However this is not possible here. And many similar iterators
do exist using the same form in the tree already.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240402032729.2736685-12-nico@fluxnic.net
2024-04-23 12:40:30 +02:00
Nicolas Pitre
684cbb49f9 thermal/drivers/mediatek/lvts_thermal: Provision for gt variable location
The golden temperature calibration value in nvram is not always the
3rd byte. A future commit will prove this assumption wrong.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240402032729.2736685-11-nico@fluxnic.net
2024-04-23 12:40:30 +02:00
Nicolas Pitre
a4c1ab2f4c thermal/drivers/mediatek/lvts_thermal: Add MT8186 support
Various values extracted from the vendor's kernel driver.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240402032729.2736685-9-nico@fluxnic.net
2024-04-23 12:40:30 +02:00
Nicolas Pitre
2cc0b1a216 thermal/drivers/mediatek/lvts_thermal: Guard against efuse data buffer overflow
We don't want to silently fetch garbage past the actual buffer.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240402032729.2736685-6-nico@fluxnic.net
2024-04-23 12:40:29 +02:00
Nicolas Pitre
5b3367e28a thermal/drivers/mediatek/lvts_thermal: Use offsets for every calibration byte
Current code assumes calibration values are always stored contiguously
in host endian order. A future patch will prove this wrong.

Let's specify the offset for each calibration byte instead.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240402032729.2736685-5-nico@fluxnic.net
2024-04-23 12:40:29 +02:00
Nicolas Pitre
554bca3130 thermal/drivers/mediatek/lvts_thermal: Remove .hw_tshut_temp
All the .hw_tshut_temp instances are initialized with the same value.
Let's remove those and use a common definition instead. If ever a
different value must be used in the future then an override parameter
could be added back.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240402032729.2736685-4-nico@fluxnic.net
2024-04-23 12:40:29 +02:00
Nicolas Pitre
62194e637d thermal/drivers/mediatek/lvts_thermal: Move comment
Move efuse data interpretation inside lvts_golden_temp_init() alongside
the actual code retrieving wanted value.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240402032729.2736685-3-nico@fluxnic.net
2024-04-23 12:40:29 +02:00
Nicolas Pitre
8c25958f71 thermal/drivers/mediatek/lvts_thermal: Retrieve all calibration bytes
Calibration values are 24-bit wide. Those values so far appear to span
only 16 bits but let's not push our luck.

Found while looking at the original Mediatek driver code.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240402032729.2736685-2-nico@fluxnic.net
2024-04-23 12:40:29 +02:00
Christophe JAILLET
61fad0a906 thermal/drivers/k3_bandgap: Remove some unused fields in struct k3_bandgap
In "struct k3_bandgap", the 'conf' field is unused.
Remove it.

Found with cppcheck, unusedStructMember.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/206d39bed9ec6df9b4d80b1fc064e499389fc7fc.1712687420.git.christophe.jaillet@wanadoo.fr
2024-04-23 12:40:29 +02:00
Christophe JAILLET
0a0c8db884 thermal/drivers/qcom: Remove some unused fields in struct qpnp_tm_chip
In "struct qpnp_tm_chip", the 'prev_stage' field is unused.
Remove it.

Found with cppcheck, unusedStructMember.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/d1c3a3c455f485dae46290e3488daf1dcc1d355a.1712687589.git.christophe.jaillet@wanadoo.fr
2024-04-23 12:40:29 +02:00
Aleksandr Mishin
d998ddc86a thermal/drivers/tsens: Fix null pointer dereference
compute_intercept_slope() is called from calibrate_8960() (in tsens-8960.c)
as compute_intercept_slope(priv, p1, NULL, ONE_PT_CALIB) which lead to null
pointer dereference (if DEBUG or DYNAMIC_DEBUG set).
Fix this bug by adding null pointer check.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: dfc1193d4d ("thermal/drivers/tsens: Replace custom 8960 apis with generic apis")
Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru>
Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240411114021.12203-1-amishin@t-argos.ru
2024-04-23 12:40:29 +02:00
Hsin-Te Yuan
7954c92ede thermal/drivers/mediatek/lvts_thermal: Add coeff for mt8192
In order for lvts_raw_to_temp to function properly on mt8192,
temperature coefficients for mt8192 need to be added.

Fixes: 288732242d ("thermal/drivers/mediatek/lvts_thermal: Add mt8192 support")
Signed-off-by: Hsin-Te Yuan <yuanhsinte@chromium.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240416-lvts_thermal-v2-1-f8a36882cc53@chromium.org
2024-04-23 12:40:29 +02:00
Niklas Söderlund
63d23fb781 thermal/drivers/rcar_gen3: Update temperature approximation calculation
The initial driver used a formula to approximate the temperature and
register values reversed engineered from an out-of-tree BSP driver. This
was needed as the datasheet at the time did not contain any information
on how to do this. Later Gen3 (Rev 2.30) and Gen4 (all) now contains
this information.

Update the approximation formula to use the datasheet's information
instead of the reversed-engineered one.

On an idle M3-N without fused calibration values for PTAT and THCODE the
old formula reports,

    zone0: 52000
    zone1: 53000
    zone2: 52500

While the new formula under the same circumstances reports,

    zone0: 52500
    zone1: 54000
    zone2: 54000

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240327133013.3982199-3-niklas.soderlund+renesas@ragnatech.se
2024-04-23 12:40:29 +02:00
Niklas Söderlund
566e0ea7d0 thermal/drivers/rcar_gen3: Move Tj_T storage to shared private data
The calculated Tj_T constant is calculated from the PTAT data either
read from the first TSC zone on the device if calibration data is fused,
or from fallback values in the driver itself. The value calculated is
shared among all TSC zones.

Move the Tj_T constant to the shared private data structure instead of
duplicating it in each TSC private data. This requires adding a pointer
to the shared data to the TSC private data structure. This back pointer
make it easier to further rework the temperature conversion logic.

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240327133013.3982199-2-niklas.soderlund+renesas@ragnatech.se
2024-04-23 12:40:29 +02:00
Dmitry Rokosov
7fcd7dfa5e thermal/drivers/amlogic: Support A1 SoC family Thermal Sensor controller
In comparison to other Amlogic chips, there is one key difference.
The offset for the sec_ao base, also known as u_efuse_off, is special,
while other aspects remain the same.

Signed-off-by: Dmitry Rokosov <ddrokosov@salutedevices.com>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240328191322.17551-3-ddrokosov@salutedevices.com
2024-04-23 12:40:29 +02:00
Priyansh Jain
34b9a92b68 thermal/drivers/tsens: Add suspend to RAM support for tsens
As part of suspend to RAM, tsens hardware will be turned off.
While resume callback, re-initialize tsens hardware.

Signed-off-by: Priyansh Jain <quic_priyjain@quicinc.com>
Acked-by: Amit Kucheria <amitk@kernel.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240328050230.31770-1-quic_priyjain@quicinc.com
2024-04-23 12:40:29 +02:00
Rasmus Villemoes
58b1569244 thermal/drivers/armada: Simplify name sanitization
Simplify the code by using the helper we have for doing exactly this.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240320104940.65031-1-linux@rasmusvillemoes.dk
2024-04-23 12:40:29 +02:00
Konrad Dybcio
d9d3490c48 thermal/drivers/qcom/lmh: Check for SCM availability at probe
Up until now, the necessary scm availability check has not been
performed, leading to possible null pointer dereferences (which did
happen for me on RB1).

Fix that.

Fixes: 53bca371cd ("thermal/drivers/qcom: Add support for LMh driver")
Cc: <stable@vger.kernel.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240308-topic-rb1_lmh-v2-2-bac3914b0fe3@linaro.org
2024-04-23 12:40:29 +02:00
Rafael J. Wysocki
80f5fd45c7 thermal: core: Introduce .trip_crossed() callback for thermal governors
Introduce a new thermal governor callback called .trip_crossed()
that will be invoked whenever a trip point is crossed by the zone
temperature, either on the way up or on the way down.

The trip crossing direction information will be passed to it and if
multiple trips are crossed in the same direction during one thermal zone
update, the new callback will be invoked for them in temperature order,
either ascending or descending, depending on the trip crossing
direction.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-19 22:11:06 +02:00
Rafael J. Wysocki
5c897a9a12 Merge back earlier thermal control material for v6.10. 2024-04-19 15:17:21 +02:00
Rafael J. Wysocki
b552f63cd4 thermal/debugfs: Add missing count increment to thermal_debug_tz_trip_up()
The count field in struct trip_stats, representing the number of times
the zone temperature was above the trip point, needs to be incremented
in thermal_debug_tz_trip_up(), for two reasons.

First, if a trip point is crossed on the way up for the first time,
thermal_debug_update_temp() called from update_temperature() does
not see it because it has not been added to trips_crossed[] array
in the thermal zone's struct tz_debugfs object yet.  Therefore, when
thermal_debug_tz_trip_up() is called after that, the trip point's
count value is 0, and the attempt to divide by it during the average
temperature computation leads to a divide error which causes the kernel
to crash.  Setting the count to 1 before the division by incrementing it
fixes this problem.

Second, if a trip point is crossed on the way up, but it has been
crossed on the way up already before, its count value needs to be
incremented to make a record of the fact that the zone temperature is
above the trip now.  Without doing that, if the mitigations applied
after crossing the trip cause the zone temperature to drop below its
threshold, the count will not be updated for this episode at all and
the average temperature in the trip statistics record will be somewhat
higher than it should be.

Fixes: 7ef01f228c ("thermal/debugfs: Add thermal debugfs information for mitigation episodes")
Cc :6.8+ <stable@vger.kernel.org> # 6.8+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-04-19 15:08:19 +02:00
Rafael J. Wysocki
0dbf608717 Merge branch 'thermal-intel' into thermal
* thermal-intel:
  thermal: intel: int340x_thermal: replace deprecated strncpy() with strscpy()
  thermal: intel: hfi: Enable HFI only when required
  thermal: netlink: Rename thermal_gnl_family
  thermal: netlink: Add genetlink bind/unbind notifications
2024-04-15 15:45:32 +02:00
Lukas Wunner
66bc1a1733 treewide: Use sysfs_bin_attr_simple_read() helper
Deduplicate ->read() callbacks of bin_attributes which are backed by a
simple buffer in memory:

Use the newly introduced sysfs_bin_attr_simple_read() helper instead,
either by referencing it directly or by declaring such bin_attributes
with BIN_ATTR_SIMPLE_RO() or BIN_ATTR_SIMPLE_ADMIN_RO().

Aside from a reduction of LoC, this shaves off a few bytes from vmlinux
(304 bytes on an x86_64 allyesconfig).

No functional change intended.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Acked-by: Zhi Wang <zhiwang@kernel.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/92ee0a0e83a5a3f3474845db6c8575297698933a.1712410202.git.lukas@wunner.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-11 16:02:25 +02:00
Sumeet Pawnikar
79b510c492 ACPI: DPTF: Add Lunar Lake support
Add Lunar Lake ACPI IDs for DPTF devices.

Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-04-08 16:10:27 +02:00
Rafael J. Wysocki
0444d574fb thermal: core: Relocate the struct thermal_governor definition
Notice that struct thermal_governor is only used by the thermal core
and so move its definition to thermal_core.h.

No functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-08 16:01:50 +02:00
Rafael J. Wysocki
7454f2c42c thermal: core: Sort trip point crossing notifications by temperature
If multiple trip points are crossed in one go and the trips table in
the thermal zone device object is not sorted, the corresponding trip
point crossing notifications sent to user space will not be ordered
either.

Moreover, if the trips table is sorted by trip temperature in ascending
order, the trip crossing notifications on the way up will be sent in that
order too, but the trip crossing notifications on the way down will be
sent in the reverse order.

This is generally confusing and it is better to make the kernel send the
notifications in the order of growing (on the way up) or falling (on the
way down) trip temperature.

To achieve that, instead of sending a trip crossing notification and
recording a trip crossing event in the statistics right away from
handle_thermal_trip(), put the trip in question on a list that will be
sorted by __thermal_zone_device_update() after processing all of the trips
and before sending the notifications and recording trip crossing events.

Link: https://lore.kernel.org/linux-pm/20240306085428.88011-1-daniel.lezcano@linaro.org/
Reported-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-08 16:01:20 +02:00
Rafael J. Wysocki
9ad18043fb thermal: core: Send trip crossing notifications at init time if needed
If a trip point is already exceeded by the zone temperature at the
initialization time, no trip crossing notification is send regarding
this even though mitigation should be started then.

Address this by rearranging the code in handle_thermal_trip() to
send a trip crossing notification for trip points already exceeded
by the zone temperature initially which also allows to reduce its
size by using the observation that the initialization and regular
trip crossing on the way up become the same case then.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-08 16:01:20 +02:00
Rafael J. Wysocki
f99c1b87a9 thermal: core: Rewrite comments in handle_thermal_trip()
Make the comments regarding trip crossing and threshold updates in
handle_thermal_trip() slightly more clear.

No functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-04-08 16:01:20 +02:00
Rafael J. Wysocki
b1ae92dcfa thermal: core: Make struct thermal_zone_device definition internal
Move the definitions of struct thermal_trip_desc and struct
thermal_zone_device to an internal header file in the thermal core,
as they don't need to be accessible to any code other than the thermal
core and so they don't need to be present in a global header.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-04-08 16:01:20 +02:00
Rafael J. Wysocki
daeeb032f4 thermal: core: Move threshold out of struct thermal_trip
The threshold field in struct thermal_trip is only used internally by
the thermal core and it is better to prevent drivers from misusing it.
It also takes some space unnecessarily in the trip tables passed by
drivers to the core during thermal zone registration.

For this reason, introduce struct thermal_trip_desc as a wrapper around
struct thermal_trip, move the threshold field directly into it and make
the thermal core store struct thermal_trip_desc objects in the internal
thermal zone trip tables.  Adjust all of the code using trip tables in
the thermal core accordingly.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-08 16:01:20 +02:00
Rafael J. Wysocki
053b852c46 thermal: gov_step_wise: Simplify checks related to passive trips
Make it more clear from the code flow that the passive polling status
updates only take place for passive trip points.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-08 15:57:50 +02:00
Rafael J. Wysocki
f1164d333c thermal: gov_step_wise: Simplify get_target_state()
The step-wise governor's get_target_state() function contains redundant
braces, redundant parens and a redundant next_target local variable, so
get rid of all that stuff.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-08 15:57:50 +02:00
Nikita Travkin
da781936e7 thermal: gov_power_allocator: Allow binding without trip points
IPA probe function was recently refactored to perform extra error checks
and make sure the thermal zone has trip points necessary for the IPA
operation. With this change, if a thermal zone is probed such that it
has no trip points that IPA can use, IPA will fail and the TZ won't be
created. This is the case if a platform defines a TZ without cooling
devices and only with "hot"/"critical" trip points, often found on some
Qualcomm devices [1].

Documentation across IPA code (notably get_governor_trips() kerneldoc)
suggests that IPA is supposed to handle such TZ even if it won't
actually do anything.

This commit partially reverts the previous change to allow IPA to bind
to such "empty" thermal zones.

Fixes: e83747c2f8 ("thermal: gov_power_allocator: Set up trip points earlier")
Link: arch/arm64/boot/dts/qcom/sc7180.dtsi#n4776 # [1]
Signed-off-by: Nikita Travkin <nikita@trvn.ru>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-04-03 16:32:15 +02:00
Nikita Travkin
1057c4c36e thermal: gov_power_allocator: Allow binding without cooling devices
IPA was recently refactored to split out memory allocation into a
separate funciton. That funciton was made to return -EINVAL if there is
zero power_actors and thus no memory to allocate. This causes IPA to
fail probing when the thermal zone has no attached cooling devices.

Since cooling devices can attach after the thermal zone is created and
the governer is attached to it, failing probe due to the lack of cooling
devices is incorrect.

Change the allocate_actors_buffer() to return success when there is no
cooling devices present.

Fixes: 912e97c67c ("thermal: gov_power_allocator: Move memory allocation out of throttle()")
Signed-off-by: Nikita Travkin <nikita@trvn.ru>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-04-03 16:32:14 +02:00
Ye Zhang
a26de34b3c thermal: devfreq_cooling: Fix perf state when calculate dfc res_util
The issue occurs when the devfreq cooling device uses the EM power model
and the get_real_power() callback is provided by the driver.

The EM power table is sorted ascending,can't index the table by cooling
device state,so convert cooling state to performance state by
dfc->max_state - dfc->capped_state.

Fixes: 615510fe13 ("thermal: devfreq_cooling: remove old power model and use EM")
Cc: 5.11+ <stable@vger.kernel.org> # 5.11+
Signed-off-by: Ye Zhang <ye.zhang@rock-chips.com>
Reviewed-by: Dhruva Gole <d-gole@ti.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-03-27 16:27:39 +01:00
Justin Stitt
03fa9a3ad1 thermal: intel: int340x_thermal: replace deprecated strncpy() with strscpy()
strncpy() is deprecated for use on NUL-terminated destination strings
[1] and as such we should prefer more robust and less ambiguous string
interfaces.

psvt->limit.string can only be 8 bytes so let's use the appropriate size
macro ACPI_LIMIT_STR_MAX_LEN.

Neither psvt->limit.string or psvt_user[i].limit.string requires the
NUL-padding behavior that strncpy() provides as they have both been
filled with NUL-bytes prior to the string operation.
|	memset(&psvt->limit, 0, sizeof(u64));
and
| 	psvt_user = kzalloc(psvt_len, GFP_KERNEL);

Let's use `strscpy` [2] due to the fact that it guarantees
NUL-termination on the destination buffer without unnecessarily
NUL-padding.

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings # [1]
Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2]
Link: https://github.com/KSPP/linux/issues/90
Signed-off-by: Justin Stitt <justinstitt@google.com>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-03-27 14:57:44 +01:00
Stanislaw Gruszka
b33f3d2677 thermal: intel: hfi: Enable HFI only when required
Enable and disable hardware feedback interface (HFI) when user space
handler is present. For example, enable HFI, when intel-speed-select or
Intel Low Power daemon is running and subscribing to thermal netlink
events. When user space handlers exit or remove subscription for
thermal netlink events, disable HFI.

Summary of changes:

 - Register a thermal genetlink notifier

 - In the notifier, process THERMAL_NOTIFY_BIND and THERMAL_NOTIFY_UNBIND
   reason codes to count number of thermal event group netlink multicast
   clients. If thermal netlink group has any listener enable HFI on all
   packages. If there are no listener disable HFI on all packages.

- When CPU is online, instead of blindly enabling HFI, check if
  the thermal netlink group has any listener. This will make sure that
  HFI is not enabled by default during boot time.

- Actual processing to enable/disable matches what is done in
  suspend/resume callbacks. Create two functions hfi_enable_instance()
  and hfi_disable_instance(), which can be called from the netlink
  notifier callback and suspend/resume callbacks.

Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-03-27 14:50:26 +01:00
Stanislaw Gruszka
afdaff3706 thermal: netlink: Rename thermal_gnl_family
Almost all thermal netlink structures use thermal_genl prefix.
Change thermal_gnl_family name accordingly for consistency.

No functional impact.

Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-03-27 14:50:26 +01:00
Stanislaw Gruszka
cf580ad490 thermal: netlink: Add genetlink bind/unbind notifications
Introduce a new feature to the thermal netlink framework, enabling the
registration of sub drivers to receive events via a notifier mechanism.
Specifically, implement genetlink family bind and unbind callbacks to send
BIND and UNBIND events.

The primary purpose of this enhancement is to facilitate the tracking of
user-space consumers by the Intel HFI driver. By leveraging these
notifications, the driver can determine when consumers are present
or absent.

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-03-27 14:50:26 +01:00
Daniel Lezcano
f67cf45dee Revert "thermal: core: Don't update trip points inside the hysteresis range"
It has been reported the commit cf3986f8c0 introduced a regression
when the temperature is wavering in the hysteresis region. The
mitigation stops leading to an uncontrolled temperature increase until
reaching the critical trip point.

Here what happens:

 * 'throttle' is when the current temperature is greater than the trip
   point temperature
 * 'target' is the mitigation level
 * 'passive' is positive when there is a mitigation, zero otherwise
 * these values are computed in the step_wise governor

Configuration:

 trip point 1: temp=95°C, hyst=5°C (passive)
 trip point 2: temp=115°C, hyst=0°C (critical)
 governor: step_wise

 1. The temperature crosses the way up the trip point 1 at 95°C

   - trend=raising
   - throttle=1, target=1
   - passive=1
   - set_trips: low=90°C, high=115°C

 2. The temperature decreases but stays in the hysteresis region at
    93°C

   - trend=dropping
   - throttle=0, target=0
   - passive=1

   Before cf3986f8c0
   - set_trips: low=90°C, high=95°C

   After cf3986f8c0
   - set_trips: low=90°C, high=115°C

 3. The temperature increases a bit but stays in the hysteresis region
    at 94°C (so below the trip point 1 temp 95°C)

   - trend=raising
   - throttle=0, target=0
   - passive=1

   Before cf3986f8c0
   - set_trips: low=90°C, high=95°C

   After cf3986f8c0
   - set_trips: low=90°C, high=115°C

 4. The temperature decreases but stays in the hysteresis region at
    93°C

   - trend=dropping
   - throttle=0, target=THERMAL_NO_TARGET
   - passive=0

   Before cf3986f8c0
   - set_trips: low=90°C, high=95°C

   After cf3986f8c0
   - set_trips: low=90°C, high=115°C

At this point, the 'passive' value is zero, there is no mitigation,
the temperature is in the hysteresis region, the next trip point is
115°C. As 'passive' is zero, the timer to monitor the thermal zone is
disabled. Consequently if the temperature continues to increase, no
mitigation will happen and it will reach the 115°C trip point and
reboot.

Before the optimization, the high boundary would have been 95°C, thus
triggering the mitigation again and rearming the polling timer.

The optimization make sense but given the current implementation of
the step_wise governor collaborating via this 'passive' flag with the
core framework it can not work.

From a higher perspective it seems like there is a problem between the
governor which sets a variable to be used by the core framework. That
sounds akward and it would make much more sense if the core framework
controls the governor and not the opposite. But as the devil hides in
the details, there are some subtilities to be addressed before.

Elaborating those would be out of the scope this changelog. So let's
stay simple and revert the change first to fixup all broken mobile
platforms.

This reverts commit cf3986f8c0 ("thermal: core: Don't update trip
points inside the hysteresis range") and takes a conflict with commit
0c0c4740c9 ("0c0c4740c9d2 thermal: trip: Use for_each_trip() in
__thermal_zone_set_trips()") in drivers/thermal/thermal_trip.c into
account.

Fixes: cf3986f8c0 ("thermal: core: Don't update trip points inside the hysteresis range")
Reported-by: Manaf Meethalavalappu Pallikunhi <quic_manafm@quicinc.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Cc: 6.7+ <stable@vger.kernel.org> # 6.7+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-03-26 13:18:13 +01:00
Rafael J. Wysocki
4e7193acde - Fix memory leak in the error path at probe time in the Mediatek LVTS
driver (Christophe Jaillet)
 
 - Fix control buffer enablement regression on Meditek MT7896 (Frank
   Wunderlich)
 
 - Drop spaces before TABs in different places: thermal-of, ST drivers
   and Makefile (Geert Uytterhoeven)
 
 - Adjust DT binding for NXP as fsl,tmu-range min/maxItems can vary
   among several SoC versions (Fabio Estevam)
 
 - Add support for H616 THS controller for the Sun8i platforms. Note
   that this change relies on another change in the SoC specific code
   which is included in this branch (Martin Botka)
 
 - Don't fail probe due to zone registration failure because there is
   no trip points defined in the DT (Mark Brown)
 
 - Support variable TMU array size for new platforms (Peng Fan)
 
 - Adjust the DT binding for thermal-of and make the polling time not
   required and assume it is zero when not found in the DT (Konrad
   Dybcio)
 
 - Add r8a779h0 support in both the DT and the driver (Geert Uytterhoeven)
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGn3N4YVz0WNVyHskqDIjiipP6E8FAmXxqsUACgkQqDIjiipP
 6E9jDgf+NpBl2l9Wb8EhFcB+QRcOY79rn8KzifM4XdU1qwoog3I64lt5mAcznpbf
 7hLK9DQJYjuIx8oi/uxKaYSCno1bwuY/Vpi0/BBQ2DxhhX26C7XfZ+D4o1M3/ciZ
 /jsoaXg3WMKYVeywaHohWxjWH+rLGXxmxDlfcAXlKdAZkbEKLCzsbaRaSy/H8Xhc
 zk9Cr2RdV+HHcuAWK1TGEHUGXINSR5tk5TvY7QCfvLYsloGpSLi6vr68CAEpvBQB
 6pxW8BkAWAVQy6pgP3WthzpE+ZkEnv1vhu7fn0uoEjuxedqvdj+PKDC4svm1k/pA
 +8sV89A5RZi1zOoWWYj8TnH/K2W85w==
 =rTh4
 -----END PGP SIGNATURE-----

Merge tag 'thermal-v6.9-rc1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux

Merge additional thermal control changes for 6.9-rc1 from Daniel Lezcano:

"- Fix memory leak in the error path at probe time in the Mediatek LVTS
   driver (Christophe Jaillet)

 - Fix control buffer enablement regression on Meditek MT7896 (Frank
   Wunderlich)

 - Drop spaces before TABs in different places: thermal-of, ST drivers
   and Makefile (Geert Uytterhoeven)

 - Adjust DT binding for NXP as fsl,tmu-range min/maxItems can vary
   among several SoC versions (Fabio Estevam)

 - Add support for H616 THS controller for the Sun8i platforms. Note
   that this change relies on another change in the SoC specific code
   which is included in this branch (Martin Botka)

 - Don't fail probe due to zone registration failure because there is
   no trip points defined in the DT (Mark Brown)

 - Support variable TMU array size for new platforms (Peng Fan)

 - Adjust the DT binding for thermal-of and make the polling time not
   required and assume it is zero when not found in the DT (Konrad
   Dybcio)

 - Add r8a779h0 support in both the DT and the driver (Geert Uytterhoeven)"

* tag 'thermal-v6.9-rc1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux:
  thermal/drivers/rcar_gen3: Add support for R-Car V4M
  dt-bindings: thermal: rcar-gen3-thermal: Add r8a779h0 support
  thermal/of: Assume polling-delay(-passive) 0 when absent
  dt-bindings: thermal-zones: Don't require polling-delay(-passive)
  thermal/drivers/qoriq: Fix getting tmu range
  thermal/drivers/sun8i: Don't fail probe due to zone registration failure
  thermal/drivers/sun8i: Add support for H616 THS controller
  thermal/drivers/sun8i: Add SRAM register access code
  thermal/drivers/sun8i: Extend H6 calibration to support 4 sensors
  thermal/drivers/sun8i: Explain unknown H6 register value
  dt-bindings: thermal: sun8i: Add H616 THS controller
  soc: sunxi: sram: export register 0 for THS on H616
  dt-bindings: thermal: qoriq-thermal: Adjust fsl,tmu-range min/maxItems
  thermal: Drop spaces before TABs
  thermal/drivers/mediatek: Fix control buffer enablement on MT7896
  thermal/drivers/mediatek/lvts_thermal: Fix a memory leak in an error handling path
2024-03-13 20:35:48 +01:00
Linus Torvalds
259f7d5e2b Thermal control updates for 6.9-rc1
- Store zone trips table and zone operations directly in struct
    thermal_zone_device (Rafael Wysocki).
 
  - Fix up flex array initialization during thermal zone device
    registration (Nathan Chancellor).
 
  - Rework writable trip points handling in the thermal core and
    several drivers (Rafael Wysocki).
 
  - Thermal core code cleanups (Dan Carpenter, Flavio Suligoi).
 
  - Use thermal zone accessor functions in the int340x Intel thermal
    driver (Rafael Wysocki).
 
  - Add Lunar Lake-M PCI ID to the int340x Intel thermal driver (Srinivas
    Pandruvada).
 
  - Minor fixes for thermal governors (Rafael Wysocki, Di Shen).
 
  - Trip point handling fixes for the iwlwifi wireless driver (Rafael
    Wysocki).
 
  - Code cleanups (Rafael J. Wysocki, AngeloGioacchino Del Regno).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmXvJ0oSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRx63MQAIAzLFMzDqbG5bFq096tREuwtYhkQMq/
 n3ZW+FhMfSr5MnGiTCelk/6auYvijMweylxrnfQM8ilIrSWVc2fNks6PTjI/hTe6
 OUfF+nEAu+fv6I68p3evlI+IL7cncU1kygYhDRr6yxh5AFDn/BED/Klv8Ms0CkOi
 YVk6+ZCsvkcC74Tvjm9+wDJZ7XHBqKXsYCyBKqxSBmMePc0FOqDGgji2d6Q8O9Ka
 uITT9W4IhF9GNEV/ujUIrHVbfkUqJHn1sfJTOynG1Zp/MopA8mXAB2fWvkl4Kfd9
 UvgHXZnBM4jFONlqHNlaV9mTiigMaKsgU1wfaSj7fgj8DELWGII0MQfC9kcUgvlJ
 +qqmZ52tc8DKU3Lj6Wg58wgMTrI4XVAJjXwg9CTo65y6KyMuT1dkypnH95TdVtWl
 qZJ9WdxAmAbCJqZzj10kn44HrF565/t0hShrjKvv+inzDyZ5jXMttK3TQS20REsC
 MzoIxahlSUkN32OjiKhebrTNShzqFM6dxTDJLktMiInpgnnZJ/VG4Bao+NkSlLIJ
 ZwTV1xOqZZarkPVMlrOijE1bs6HbomZ7ZEsDSxvtwp+MZ06G4ICY11/KbTw9IZFv
 lCZiFNEzzxzrgqcz+5gS9y8/alknqiU5DSKCxfhbnNTW+Tk09mYPK5N0umUGwaTA
 gQ4fWsBoTpZF
 =Is/Y
 -----END PGP SIGNATURE-----

Merge tag 'thermal-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull thermal control updates from Rafael Wysocki:
 "These mostly change the thermal core in a few ways allowing thermal
  drivers to be simplified, in particular in their removal and failing
  probe handling parts that are notoriously prone to errors, and
  propagate the changes to several drivers.

  Apart from that, support for a new platform is added (Intel Lunar
  Lake-M), some bugs are fixed and some code is cleaned up, as usual.

  Specifics:

   - Store zone trips table and zone operations directly in struct
     thermal_zone_device (Rafael Wysocki)

   - Fix up flex array initialization during thermal zone device
     registration (Nathan Chancellor)

   - Rework writable trip points handling in the thermal core and
     several drivers (Rafael Wysocki)

   - Thermal core code cleanups (Dan Carpenter, Flavio Suligoi)

   - Use thermal zone accessor functions in the int340x Intel thermal
     driver (Rafael Wysocki)

   - Add Lunar Lake-M PCI ID to the int340x Intel thermal driver
     (Srinivas Pandruvada)

   - Minor fixes for thermal governors (Rafael Wysocki, Di Shen)

   - Trip point handling fixes for the iwlwifi wireless driver (Rafael
     Wysocki)

   - Code cleanups (Rafael J. Wysocki, AngeloGioacchino Del Regno)"

* tag 'thermal-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (29 commits)
  thermal: core: remove unnecessary check in trip_point_hyst_store()
  thermal: intel: int340x_thermal: Use thermal zone accessor functions
  thermal: core: Remove excess empty line from a comment
  thermal: int340x: processor_thermal: Add Lunar Lake-M PCI ID
  thermal: core: Eliminate writable trip points masks
  thermal: of: Set THERMAL_TRIP_FLAG_RW_TEMP directly
  thermal: imx: Set THERMAL_TRIP_FLAG_RW_TEMP directly
  wifi: iwlwifi: mvm: Set THERMAL_TRIP_FLAG_RW_TEMP directly
  mlxsw: core_thermal: Set THERMAL_TRIP_FLAG_RW_TEMP directly
  thermal: intel: Set THERMAL_TRIP_FLAG_RW_TEMP directly
  thermal: core: Drop the .set_trip_hyst() thermal zone operation
  thermal: core: Add flags to struct thermal_trip
  thermal: core: Move initial num_trips assignment before memcpy()
  thermal: Get rid of CONFIG_THERMAL_WRITABLE_TRIPS
  thermal: intel: Adjust ops handling during thermal zone registration
  thermal: ACPI: Constify acpi_thermal_zone_ops
  thermal: core: Store zone ops in struct thermal_zone_device
  thermal: intel: Discard trip tables after zone registration
  thermal: ACPI: Discard trips table after zone registration
  thermal: core: Store zone trips table in struct thermal_zone_device
  ...
2024-03-13 12:03:57 -07:00
Linus Torvalds
07abb19a9b Power management updates for 6.9-rc1
- Allow the Energy Model to be updated dynamically (Lukasz Luba).
 
  - Add support for LZ4 compression algorithm to the hibernation image
    creation and loading code (Nikhil V).
 
  - Fix and clean up system suspend statistics collection (Rafael
    Wysocki).
 
  - Simplify device suspend and resume handling in the power management
    core code (Rafael Wysocki).
 
  - Fix PCI hibernation support description (Yiwei Lin).
 
  - Make hibernation take set_memory_ro() return values into account as
    appropriate (Christophe Leroy).
 
  - Set mem_sleep_current during kernel command line setup to avoid an
    ordering issue with handling it (Maulik Shah).
 
  - Fix wake IRQs handling when pm_runtime_force_suspend() is used as a
    driver's system suspend callback (Qingliang Li).
 
  - Simplify pm_runtime_get_if_active() usage and add a replacement for
    pm_runtime_put_autosuspend() (Sakari Ailus).
 
  - Add a tracepoint for runtime_status changes tracking (Vilas Bhat).
 
  - Fix section title markdown in the runtime PM documentation (Yiwei
    Lin).
 
  - Enable preferred core support in the amd-pstate cpufreq driver (Meng
    Li).
 
  - Fix min_perf assignment in amd_pstate_adjust_perf() and make the
    min/max limit perf values in amd-pstate always stay within the
    (highest perf, lowest perf) range (Tor Vic, Meng Li).
 
  - Allow intel_pstate to assign model-specific values to strings used in
    the EPP sysfs interface and make it do so on Meteor Lake (Srinivas
    Pandruvada).
 
  - Drop long-unused cpudata::prev_cummulative_iowait from the
    intel_pstate cpufreq driver (Jiri Slaby).
 
  - Prevent scaling_cur_freq from exceeding scaling_max_freq when the
    latter is an inefficient frequency (Shivnandan Kumar).
 
  - Change default transition delay in cpufreq to 2ms (Qais Yousef).
 
  - Remove references to 10ms minimum sampling rate from comments in the
    cpufreq code (Pierre Gondois).
 
  - Honour transition_latency over transition_delay_us in cpufreq (Qais
    Yousef).
 
  - Stop unregistering cpufreq cooling on CPU hot-remove (Viresh Kumar).
 
  - General enhancements / cleanups to ARM cpufreq drivers (tianyu2,
    Nícolas F. R. A. Prado, Erick Archer, Arnd Bergmann, Anastasia
    Belova).
 
  - Update cpufreq-dt-platdev to block/approve devices (Richard Acayan).
 
  - Make the SCMI cpufreq driver get a transition delay value from
    firmware (Pierre Gondois).
 
  - Prevent the haltpoll cpuidle governor from shrinking guest
    poll_limit_ns below grow_start (Parshuram Sangle).
 
  - Avoid potential overflow in integer multiplication when computing
    cpuidle state parameters (C Cheng).
 
  - Adjust MWAIT hint target C-state computation in the ACPI cpuidle
    driver and in intel_idle to return a correct value for C0 (He
    Rongguang).
 
  - Address multiple issues in the TPMI RAPL driver and add support for
    new platforms (Lunar Lake-M, Arrow Lake) to Intel RAPL (Zhang Rui).
 
  - Fix freq_qos_add_request() return value check in dtpm_cpu (Daniel
    Lezcano).
 
  - Fix kernel-doc for dtpm_create_hierarchy() (Yang Li).
 
  - Fix file leak in get_pkg_num() in x86_energy_perf_policy (Samasth
    Norway Ananda).
 
  - Fix cpupower-frequency-info.1 man page typo (Jan Kratochvil).
 
  - Fix a couple of warnings in the OPP core code related to W=1
    builds (Viresh Kumar).
 
  - Move dev_pm_opp_{init|free}_cpufreq_table() to pm_opp.h (Viresh
    Kumar).
 
  - Extend dev_pm_opp_data with turbo support (Sibi Sankar).
 
  - dt-bindings: drop maxItems from inner items (David Heidelberg).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmXvI/ISHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRx24sP/jxg6fOGme8raHQvpTXG3/H56wlGzQ4P
 YUvvKUXnfD3yf1zNISsUl7VQebZqDt8rygkwSdymXlUVZX1eubN0RpCFc0F8GZuc
 THG/YQhYQr/9zro3FpKhfDj5evk21PCQzjf+dGvfQF9qVMxNPG1JzEFK6PnolT5X
 2BvkonY1XFWZjCMbZ83B/jt35lTDb0cmeNbCpfD5UJgcnxmMOtZYpORdyfPWTJpG
 GVCwmAFVVXxXlust/AIpt3mmOpKzSA9GnrtJkhtQe5GN+Y4OjnJiFJmTC7EfCctj
 JlWgVUA716mtFMUrjXgjfI54firF2oQpqaSa2HG/V/A96JWQqjarGz5dAV1IrPEt
 ZmYpvMe4E90S411wF1OWyrEqjXUuDnH1OWUvUdWSt4E7DhFw3esDi/jLW2tyVKAT
 hIy+/O4wzbDSTX/h9Cgt1Qjhew6lKUIwvhEXclB3fuJ+JoviWNkC9lnK93e2H0A3
 VYfkd/lpUD74035l0FrCJ/49MjX9kqrsn+TipHsIlSXAi8ZRdKbVvxOTD8RYudcI
 GvCiDDrkMgNwGlyedgbtTBUepCvSg93b+vVmRj7YMPtBhioOUo3qCn6wpqhxfnth
 9BCnPW7JxqUw/NJdlk9hKumaUZq+MK8G+kdYcIDg6xmAkWSUVP2QKlWavfMCxqRP
 +dN6T2iHsKFe
 =UePT
 -----END PGP SIGNATURE-----

Merge tag 'pm-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
 "From the functional perspective, the most significant change here is
  the addition of support for Energy Models that can be updated
  dynamically at run time.

  There is also the addition of LZ4 compression support for hibernation,
  the new preferred core support in amd-pstate, new platforms support in
  the Intel RAPL driver, new model-specific EPP handling in intel_pstate
  and more.

  Apart from that, the cpufreq default transition delay is reduced from
  10 ms to 2 ms (along with some related adjustments), the system
  suspend statistics code undergoes a significant rework and there is a
  usual bunch of fixes and code cleanups all over.

  Specifics:

   - Allow the Energy Model to be updated dynamically (Lukasz Luba)

   - Add support for LZ4 compression algorithm to the hibernation image
     creation and loading code (Nikhil V)

   - Fix and clean up system suspend statistics collection (Rafael
     Wysocki)

   - Simplify device suspend and resume handling in the power management
     core code (Rafael Wysocki)

   - Fix PCI hibernation support description (Yiwei Lin)

   - Make hibernation take set_memory_ro() return values into account as
     appropriate (Christophe Leroy)

   - Set mem_sleep_current during kernel command line setup to avoid an
     ordering issue with handling it (Maulik Shah)

   - Fix wake IRQs handling when pm_runtime_force_suspend() is used as a
     driver's system suspend callback (Qingliang Li)

   - Simplify pm_runtime_get_if_active() usage and add a replacement for
     pm_runtime_put_autosuspend() (Sakari Ailus)

   - Add a tracepoint for runtime_status changes tracking (Vilas Bhat)

   - Fix section title markdown in the runtime PM documentation (Yiwei
     Lin)

   - Enable preferred core support in the amd-pstate cpufreq driver
     (Meng Li)

   - Fix min_perf assignment in amd_pstate_adjust_perf() and make the
     min/max limit perf values in amd-pstate always stay within the
     (highest perf, lowest perf) range (Tor Vic, Meng Li)

   - Allow intel_pstate to assign model-specific values to strings used
     in the EPP sysfs interface and make it do so on Meteor Lake
     (Srinivas Pandruvada)

   - Drop long-unused cpudata::prev_cummulative_iowait from the
     intel_pstate cpufreq driver (Jiri Slaby)

   - Prevent scaling_cur_freq from exceeding scaling_max_freq when the
     latter is an inefficient frequency (Shivnandan Kumar)

   - Change default transition delay in cpufreq to 2ms (Qais Yousef)

   - Remove references to 10ms minimum sampling rate from comments in
     the cpufreq code (Pierre Gondois)

   - Honour transition_latency over transition_delay_us in cpufreq (Qais
     Yousef)

   - Stop unregistering cpufreq cooling on CPU hot-remove (Viresh Kumar)

   - General enhancements / cleanups to ARM cpufreq drivers (tianyu2,
     Nícolas F. R. A. Prado, Erick Archer, Arnd Bergmann, Anastasia
     Belova)

   - Update cpufreq-dt-platdev to block/approve devices (Richard Acayan)

   - Make the SCMI cpufreq driver get a transition delay value from
     firmware (Pierre Gondois)

   - Prevent the haltpoll cpuidle governor from shrinking guest
     poll_limit_ns below grow_start (Parshuram Sangle)

   - Avoid potential overflow in integer multiplication when computing
     cpuidle state parameters (C Cheng)

   - Adjust MWAIT hint target C-state computation in the ACPI cpuidle
     driver and in intel_idle to return a correct value for C0 (He
     Rongguang)

   - Address multiple issues in the TPMI RAPL driver and add support for
     new platforms (Lunar Lake-M, Arrow Lake) to Intel RAPL (Zhang Rui)

   - Fix freq_qos_add_request() return value check in dtpm_cpu (Daniel
     Lezcano)

   - Fix kernel-doc for dtpm_create_hierarchy() (Yang Li)

   - Fix file leak in get_pkg_num() in x86_energy_perf_policy (Samasth
     Norway Ananda)

   - Fix cpupower-frequency-info.1 man page typo (Jan Kratochvil)

   - Fix a couple of warnings in the OPP core code related to W=1 builds
     (Viresh Kumar)

   - Move dev_pm_opp_{init|free}_cpufreq_table() to pm_opp.h (Viresh
     Kumar)

   - Extend dev_pm_opp_data with turbo support (Sibi Sankar)

   - dt-bindings: drop maxItems from inner items (David Heidelberg)"

* tag 'pm-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (95 commits)
  dt-bindings: opp: drop maxItems from inner items
  OPP: debugfs: Fix warning around icc_get_name()
  OPP: debugfs: Fix warning with W=1 builds
  cpufreq: Move dev_pm_opp_{init|free}_cpufreq_table() to pm_opp.h
  OPP: Extend dev_pm_opp_data with turbo support
  Fix cpupower-frequency-info.1 man page typo
  cpufreq: scmi: Set transition_delay_us
  firmware: arm_scmi: Populate fast channel rate_limit
  firmware: arm_scmi: Populate perf commands rate_limit
  cpuidle: ACPI/intel: fix MWAIT hint target C-state computation
  PM: sleep: wakeirq: fix wake irq warning in system suspend
  powercap: dtpm: Fix kernel-doc for dtpm_create_hierarchy() function
  cpufreq: Don't unregister cpufreq cooling on CPU hotplug
  PM: suspend: Set mem_sleep_current during kernel command line setup
  cpufreq: Honour transition_latency over transition_delay_us
  cpufreq: Limit resolving a frequency to policy min/max
  Documentation: PM: Fix runtime_pm.rst markdown syntax
  cpufreq: amd-pstate: adjust min/max limit perf
  cpufreq: Remove references to 10ms min sampling rate
  cpufreq: intel_pstate: Update default EPPs for Meteor Lake
  ...
2024-03-13 11:40:06 -07:00
Geert Uytterhoeven
1828c1c17b thermal/drivers/rcar_gen3: Add support for R-Car V4M
Add support for the Thermal Sensor/Chip Internal Voltage Monitor/Core
Voltage Monitor (THS/CIVM/CVM) on the Renesas R-Car V4M (R8A779H0) SoC.

The conversion formulas for R-Car V4M are the same as for other R-Car
Gen4 SoCs.

Based on a patch in the BSP by Duy Nguyen.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/bd5b002a802c1e058e0048592f17862db1d04263.1709722342.git.geert+renesas@glider.be
2024-03-11 17:14:46 +01:00
Konrad Dybcio
488164006a thermal/of: Assume polling-delay(-passive) 0 when absent
Currently, thermal zones associated with providers that have interrupts
for signaling hot/critical trips are required to set a polling-delay
of 0 to indicate no polling. This feels a bit backwards.

Change the code such that "no polling delay" also means "no polling".

Suggested-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240125-topic-thermal-v1-2-3c9d4dced138@linaro.org
2024-03-11 17:14:46 +01:00
Peng Fan
4d0642074c thermal/drivers/qoriq: Fix getting tmu range
TMU Version 1 has 4 TTRCRs, while TMU Version >=2 has 16 TTRCRs.
So limit the len to 4 will report "invalid range data" for i.MX93.

This patch drop the local array with allocated ttrcr array and
able to support larger tmu ranges.

Fixes: f12d60c81f ("thermal/drivers/qoriq: Support version 2.1")
Tested-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240226003657.3012880-1-peng.fan@oss.nxp.com
2024-03-11 17:14:46 +01:00
Mark Brown
79d998421d thermal/drivers/sun8i: Don't fail probe due to zone registration failure
Currently the sun8i thermal driver will fail to probe if any of the
thermal zones it is registering fails to register with the thermal core.
Since we currently do not define any trip points for the GPU thermal
zones on at least A64 or H5 this means that we have no thermal support
on these platforms:

[    1.698703] thermal_sys: Failed to find 'trips' node
[    1.698707] thermal_sys: Failed to find trip points for thermal-sensor id=1

even though the main CPU thermal zone on both SoCs is fully configured.
This does not seem ideal, while we may not be able to use all the zones
it seems better to have those zones which are usable be operational.
Instead just carry on registering zones if we get any non-deferral
error, allowing use of those zones which are usable.

This means that we also need to update the interrupt handler to not
attempt to notify the core for events on zones which we have not
registered, I didn't see an ability to mask individual interrupts and
I would expect that interrupts would still be indicated in the ISR even
if they were masked.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240123-thermal-sun8i-registration-v3-1-3e5771b1bbdd@kernel.org
2024-03-11 17:14:46 +01:00
Martin Botka
83620b3b04 thermal/drivers/sun8i: Add support for H616 THS controller
Add support for the thermal sensor found in H616 SoCs, is the same as
the H6 thermal sensor controller, but with four sensors.
Also the registers readings are wrong, unless a bit in the first SYS_CFG
register cleared, so set exercise the SRAM regmap to take care of that.

Signed-off-by: Martin Botka <martin.botka@somainline.org>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240219153639.179814-7-andre.przywara@arm.com
2024-03-11 17:14:46 +01:00
Andre Przywara
bd0b451bd4 thermal/drivers/sun8i: Add SRAM register access code
The Allwinner H616 SoC needs to clear a bit in one register in the SRAM
controller, to report reasonable temperature values. On reset, bit 16 in
register 0x3000000 is set, which leads to the driver reporting
temperatures around 200C. Clearing this bit brings the values down to the
expected range. The BSP code does a one-time write in U-Boot, with a
comment just mentioning the effect on the THS, but offering no further
explanation.

To not rely on firmware to set things up for us, add code that queries
the SRAM controller device via a DT phandle link, then clear just this
single bit.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240219153639.179814-6-andre.przywara@arm.com
2024-03-11 17:14:46 +01:00
Maksim Kiselev
63e39fcaa6 thermal/drivers/sun8i: Extend H6 calibration to support 4 sensors
The H616 SoC resembles the H6 thermal sensor controller, with a few
changes like four sensors.

Extend sun50i_h6_ths_calibrate() function to support calibration of
these sensors.

Co-developed-by: Martin Botka <martin.botka@somainline.org>
Signed-off-by: Martin Botka <martin.botka@somainline.org>
Signed-off-by: Maksim Kiselev <bigunclemax@gmail.com>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Acked-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240219153639.179814-5-andre.przywara@arm.com
2024-03-11 17:14:46 +01:00
Andre Przywara
d7bf0a1ada thermal/drivers/sun8i: Explain unknown H6 register value
So far we were ORing in some "unknown" value into the THS control
register on the Allwinner H6. This part of the register is not explained
in the H6 manual, but the H616 manual details those bits, and on closer
inspection the THS IP blocks in both SoCs seem very close:
- The BSP code for both SoCs writes the same values into THS_CTRL.
- The reset values of at least the first three registers are the same.

Replace the "unknown" value with its proper meaning: "acquire time",
most probably the sample part of the sample & hold circuit of the ADC,
according to its explanation in the H616 manual.

No functional change, just a macro rename and adjustment.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Acked-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240219153639.179814-4-andre.przywara@arm.com
2024-03-11 17:14:46 +01:00
Geert Uytterhoeven
f492d8220f thermal: Drop spaces before TABs
There is never a need to have a space before a TAB, but it hurts the
eyes of vim users.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/480478a53fd42621e97b2db36e181903cc0f53e3.1708001426.git.geert+renesas@glider.be
2024-03-11 17:14:46 +01:00
Frank Wunderlich
371ed6263e thermal/drivers/mediatek: Fix control buffer enablement on MT7896
Reading thermal sensor on mt7986 devices returns invalid temperature:

bpi-r3 ~ # cat /sys/class/thermal/thermal_zone0/temp
 -274000

Fix this by adding missing members in mtk_thermal_data struct which were
used in mtk_thermal_turn_on_buffer after commit 33140e668b.

Cc: stable@vger.kernel.org
Fixes: 33140e668b ("thermal/drivers/mediatek: Control buffer enablement tweaks")
Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
Reviewed-by: Markus Schneider-Pargmann <msp@baylibre.com>
Reviewed-by: Daniel Golle <daniel@makrotopia.org>
Tested-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20230907112018.52811-1-linux@fw-web.de
2024-03-11 17:14:46 +01:00
Christophe JAILLET
ca93bf607a thermal/drivers/mediatek/lvts_thermal: Fix a memory leak in an error handling path
If devm_krealloc() fails, then 'efuse' is leaking.
So free it to avoid a leak.

Fixes: f5f633b182 ("thermal/drivers/mediatek: Add the Low Voltage Thermal Sensor driver")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/481d345233862d58c3c305855a93d0dbc2bbae7e.1706431063.git.christophe.jaillet@wanadoo.fr
2024-03-11 17:14:46 +01:00
Rafael J. Wysocki
3bd834640b Merge branch 'pm-em'
Merge Enery Model changes for 6.9-rc1:

 - Allow the Energy Model to be updated dynamically (Lukasz Luba).

* pm-em: (24 commits)
  PM: EM: Fix nr_states warnings in static checks
  Documentation: EM: Update with runtime modification design
  PM: EM: Add em_dev_compute_costs()
  PM: EM: Remove old table
  PM: EM: Change debugfs configuration to use runtime EM table data
  drivers/thermal/devfreq_cooling: Use new Energy Model interface
  drivers/thermal/cpufreq_cooling: Use new Energy Model interface
  powercap/dtpm_devfreq: Use new Energy Model interface to get table
  powercap/dtpm_cpu: Use new Energy Model interface to get table
  PM: EM: Optimize em_cpu_energy() and remove division
  PM: EM: Support late CPUs booting and capacity adjustment
  PM: EM: Add performance field to struct em_perf_state and optimize
  PM: EM: Add em_perf_state_from_pd() to get performance states table
  PM: EM: Introduce em_dev_update_perf_domain() for EM updates
  PM: EM: Add functions for memory allocations for new EM tables
  PM: EM: Use runtime modified EM for CPUs energy estimation in EAS
  PM: EM: Introduce runtime modifiable table
  PM: EM: Split the allocation and initialization of the EM table
  PM: EM: Check if the get_cost() callback is present in em_compute_costs()
  PM: EM: Introduce em_compute_costs()
  ...
2024-03-11 15:59:51 +01:00
Rafael J. Wysocki
dcb497ec99 Merge branches 'thermal-core' and 'thermal-intel'
Merge thermal core changes and Intel thermal drivers changes for
6.9-rc1:

 - Store zone trips table and zone operations directly in struct
   thermal_zone_device (Rafael Wysocki).

 - Rework writable trip points handling (Rafael Wysocki).

 - Thermal core code cleanups (Dan Carpenter, Flavio Suligoi).

 - Use thermal zone accessor functions in the int340x Intel thermal
   driver (Rafael Wysocki).

 - Add Lunar Lake-M PCI ID to the int340x Intel thermal driver (Srinivas
   Pandruvada).

* thermal-core:
  thermal: core: remove unnecessary check in trip_point_hyst_store()
  thermal: core: Remove excess empty line from a comment
  thermal: core: Eliminate writable trip points masks
  thermal: of: Set THERMAL_TRIP_FLAG_RW_TEMP directly
  thermal: imx: Set THERMAL_TRIP_FLAG_RW_TEMP directly
  wifi: iwlwifi: mvm: Set THERMAL_TRIP_FLAG_RW_TEMP directly
  mlxsw: core_thermal: Set THERMAL_TRIP_FLAG_RW_TEMP directly
  thermal: intel: Set THERMAL_TRIP_FLAG_RW_TEMP directly
  thermal: core: Drop the .set_trip_hyst() thermal zone operation
  thermal: core: Add flags to struct thermal_trip
  thermal: core: Move initial num_trips assignment before memcpy()
  thermal: Get rid of CONFIG_THERMAL_WRITABLE_TRIPS
  thermal: intel: Adjust ops handling during thermal zone registration
  thermal: ACPI: Constify acpi_thermal_zone_ops
  thermal: core: Store zone ops in struct thermal_zone_device
  thermal: intel: Discard trip tables after zone registration
  thermal: ACPI: Discard trips table after zone registration
  thermal: core: Store zone trips table in struct thermal_zone_device

* thermal-intel:
  thermal: intel: int340x_thermal: Use thermal zone accessor functions
  thermal: int340x: processor_thermal: Add Lunar Lake-M PCI ID
2024-03-07 21:05:12 +01:00
Dan Carpenter
59d894a078 thermal: core: remove unnecessary check in trip_point_hyst_store()
This code was shuffled around a bit recently.  We no longer need to
check the value of "ret" because we know it's zero.

Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-03-06 13:26:25 +01:00
Rafael J. Wysocki
53b94f421d thermal: intel: int340x_thermal: Use thermal zone accessor functions
Make int340x_thermal use the dedicated accessor functions for the
thermal zone device object address and the thermal zone type string.

This is requisite for future thermal core improvements.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2024-03-05 21:38:22 +01:00
Rafael J. Wysocki
166d017d34 Merge thermal core changes for 6.9 to satisfy a dependency. 2024-03-05 21:37:49 +01:00
Flavio Suligoi
32abd25087 thermal: core: Remove excess empty line from a comment
The first and the third lines of the kerneldoc comment for:

thermal_zone_device_set_polling()

belong to the same sentences, so join them together.

Signed-off-by: Flavio Suligoi <f.suligoi@asem.it>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-03-05 13:17:33 +01:00
Srinivas Pandruvada
f1f0c44522 thermal: int340x: processor_thermal: Add Lunar Lake-M PCI ID
Add Lunar Lake-M PCI ID for processor thermal device.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-02-28 20:46:51 +01:00
Rafael J. Wysocki
4a62d588a8 thermal: core: Eliminate writable trip points masks
All of the thermal_zone_device_register_with_trips() callers pass zero
writable trip points masks to it, so drop the mask argument from that
function and update all of its callers accordingly.

This also removes the artificial trip points per zone limit of 32,
related to using writable trip points masks.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-02-27 12:04:38 +01:00
Rafael J. Wysocki
83c2d444ed thermal: of: Set THERMAL_TRIP_FLAG_RW_TEMP directly
It is now possible to flag trip points with THERMAL_TRIP_FLAG_RW_TEMP
to allow their temperature to be set from user space via sysfs instead
of using a nonzero writable trips mask during thermal zone registration,
so make the OF thermal code do that.

No intentional functional impact.

Note that this change is requisite for dropping the mask argument from
thermal_zone_device_register_with_trips() going forward.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-02-27 12:04:01 +01:00
Rafael J. Wysocki
68e9c60353 thermal: imx: Set THERMAL_TRIP_FLAG_RW_TEMP directly
It is now possible to flag trip points with THERMAL_TRIP_FLAG_RW_TEMP
to allow their temperature to be set from user space via sysfs instead
of using a nonzero writable trips mask during thermal zone registration,
so make the imx thermal code do that.

No intentional functional impact.

Note that this change is requisite for dropping the mask argument from
thermal_zone_device_register_with_trips() going forward.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-02-27 12:04:01 +01:00
Rafael J. Wysocki
cca52f6969 thermal: intel: Set THERMAL_TRIP_FLAG_RW_TEMP directly
Some Intel thermal drivers need/want the temperature of their trip
points to be set by user space via sysfs and so they pass nonzero
writable trip masks during thermal zone registration for this purpose.

It is now possible to achieve the same result by setting the
THERMAL_TRIP_FLAG_RW_TEMP trip flag directly, so modify the drivers
in question to do that instead of using a nonzero writable trips mask.

No intentional functional impact.

Note that this change is requisite for dropping the mask argument from
thermal_zone_device_register_with_trips() going forward.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-02-27 12:04:01 +01:00
Rafael J. Wysocki
46f5bef8ec thermal: core: Drop the .set_trip_hyst() thermal zone operation
None of the users of the thermal core provides a .set_trip_hyst()
thermal zone operation, so drop that callback from struct
thermal_zone_device_ops and update trip_point_hyst_store()
accordingly.

No functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-02-27 12:04:01 +01:00
Rafael J. Wysocki
5340f76472 thermal: core: Add flags to struct thermal_trip
In order to allow thermal zone creators to specify the writability of
trip point temperature and hysteresis on a per-trip basis, add a flags
field to struct thermal_trip and define flags to represent the desired
trip properties.

Also make thermal_zone_device_register_with_trips() set the
THERMAL_TRIP_FLAG_RW_TEMP flag for all trips covered by the writable
trips mask passed to it and modify the thermal sysfs code to look at
the trip flags instead of using the writable trips mask directly or
checking the presence of the .set_trip_hyst() zone callback.

Additionally, make trip_point_temp_store() and trip_point_hyst_store()
fail with an error code if the trip passed to one of them has
THERMAL_TRIP_FLAG_RW_TEMP or THERMAL_TRIP_FLAG_RW_HYST,
respectively, clear in its flags.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-02-27 12:02:50 +01:00
Nathan Chancellor
da1983355c thermal: core: Move initial num_trips assignment before memcpy()
When booting a CONFIG_FORTIFY_SOURCE=y kernel compiled with a toolchain
that supports __counted_by() (such as clang-18 and newer), there is a
panic on boot:

  [    2.913770] memcpy: detected buffer overflow: 72 byte write of buffer size 0
  [    2.920834] WARNING: CPU: 2 PID: 1 at lib/string_helpers.c:1027 __fortify_report+0x5c/0x74
  ...
  [    3.039208] Call trace:
  [    3.041643]  __fortify_report+0x5c/0x74
  [    3.045469]  __fortify_panic+0x18/0x20
  [    3.049209]  thermal_zone_device_register_with_trips+0x4c8/0x4f8

This panic occurs because trips is counted by num_trips but num_trips is
assigned after the call to memcpy(), so the fortify checks think the
buffer size is zero because tz was allocated with kzalloc().

Move the num_trips assignment before the memcpy() to resolve the panic
and ensure that the fortify checks work properly.

Fixes: 9b0a627586 ("thermal: core: Store zone trips table in struct thermal_zone_device")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-02-27 11:58:47 +01:00
Rafael J. Wysocki
a85739c8c6 thermal: Get rid of CONFIG_THERMAL_WRITABLE_TRIPS
The only difference made by CONFIG_THERMAL_WRITABLE_TRIPS is whether or
not the writable trips mask passed during thermal zone registration
will take any effect, but whoever passes a non-zero writable trips mask
to thermal_zone_device_register_with_trips() can be forgiven thinking
that it will always work.

Moreover, some thermal drivers expect user space to set trip temperature
values, so they select CONFIG_THERMAL_WRITABLE_TRIPS, possibly overriding
a manual choice to unset it and going against the design purportedly
allowing system integrators to decide on the writability of trip points
for the given kernel build.  It is also set in one platform's defconfig.

Forthermore, CONFIG_THERMAL_WRITABLE_TRIPS only affects trip temperature,
because trip hysteresis is writable as long as the thermal zone provides
a callback to update it, regardless of the CONFIG_THERMAL_WRITABLE_TRIPS
value.

The above means that the symbol in question is used inconsistently and
its purpose is at least moot, so remove it and always take the writable
trip mask passed to thermal_zone_device_register_with_trips() into
account.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-02-23 18:24:48 +01:00
Rafael J. Wysocki
62dd17846d thermal: intel: Adjust ops handling during thermal zone registration
Because thermal zone operations are now stored directly in struct
thermal_zone_device, thermal zone creators can discard the operations
structure after the zone registration is complete, or it can be made
read-only.

Accordingly, make int340x_thermal_zone_add() use a local variable to
represent thermal zone operations, so it is freed automatically upon the
function exit, and make the other Intel thermal drivers use const zone
operations structures.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-02-23 18:24:48 +01:00
Rafael J. Wysocki
698a1eb1f7 thermal: core: Store zone ops in struct thermal_zone_device
The current code requires thermal zone creators to pass pointers to
writable ops structures to thermal_zone_device_register_with_trips()
which needs to modify the target struct thermal_zone_device_ops object
if the "critical" operation in it is NULL.

Moreover, the callers of thermal_zone_device_register_with_trips() are
required to hold on to the struct thermal_zone_device_ops object passed
to it until the given thermal zone is unregistered.

Both of these requirements are quite inconvenient, so modify struct
thermal_zone_device to contain struct thermal_zone_device_ops as field and
make thermal_zone_device_register_with_trips() copy the contents of the
struct thermal_zone_device_ops passed to it via a pointer (which can be
const now) to that field.

Also adjust the code using thermal zone ops accordingly and modify
thermal_of_zone_register() to use a local ops variable during
thermal zone registration so ops do not need to be freed in
thermal_of_zone_unregister() any more.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-02-23 18:24:48 +01:00
Rafael J. Wysocki
fcbf878000 thermal: intel: Discard trip tables after zone registration
Because the thermal core creates and uses its own copy of the trips
table passed to thermal_zone_device_register_with_trips(), it is not
necessary to hold on to a local copy of it any more after the given
thermal zone has been registered.

Accordingly, modify Intel thermal drivers to discard the trips tables
passed to thermal_zone_device_register_with_trips() after thermal zone
registration, for example by storing them in local variables which are
automatically discarded when the zone registration is complete.

Also make some additional code simplifications unlocked by the above
changes.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-02-23 18:24:47 +01:00
Rafael J. Wysocki
9b0a627586 thermal: core: Store zone trips table in struct thermal_zone_device
The current code expects thermal zone creators to pass a pointer to a
writable trips table to thermal_zone_device_register_with_trips() and
that trips table is then used by the thermal core going forward.

Consequently, the callers of thermal_zone_device_register_with_trips()
are required to hold on to the trips table passed to it until the given
thermal zone is unregistered, at which point the trips table can be
freed, but at the same time they are not expected to access that table
directly.  This is both error prone and confusing.

To address it, turn the trips table pointer in struct thermal_zone_device
into a flex array (counted by its num_trips field), allocate it during
thermal zone device allocation and copy the contents of the trips table
supplied by the zone creator (which can be const now) into it, which
will allow the callers of thermal_zone_device_register_with_trips() to
drop their trip tables right after the zone registration.

This requires the imx thermal driver to be adjusted to store the new
temperature in its internal trips table in imx_set_trip_temp(), because
it will be separate from the core's trips table now and it has to be
explicitly kept in sync with the latter.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-02-23 18:24:47 +01:00
Rafael J. Wysocki
2c8459a568 Merge branch 'thermal-core'
Merge thermal core changes for 6.9:

 - Minor fixes for thermal governors (Rafael J. Wysocki, Di Shen).

 - Trip point handling fixes for the iwlwifi wireless driver (Rafael J.
   Wysocki).

 - Code cleanups (Rafael J. Wysocki, AngeloGioacchino Del Regno).

* thermal-tmp:
  thermal: gov_power_allocator: Avoid overwriting PID coefficients from setup time
  thermal: sysfs: Fix up white space in trip_point_temp_store()
  iwlwifi: mvm: Use for_each_thermal_trip() for walking trip points
  iwlwifi: mvm: Populate trip table before registering thermal zone
  iwlwifi: mvm: Drop unused fw_trips_index[] from iwl_mvm_thermal_device
  thermal: core: Change governor name to const char pointer
  thermal: gov_bang_bang: Fix possible cooling device state ping-pong
  thermal: gov_fair_share: Fix dependency on trip points ordering
2024-02-23 18:22:46 +01:00
Thomas Gleixner
bd745d1c41 x86/cpu/topology: Rename topology_max_die_per_package()
The plural of die is dies.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20240213210253.065874205@linutronix.de
2024-02-15 22:07:45 +01:00
Zhang Rui
1aa09b9379 powercap: intel_rapl: Fix locking in TPMI RAPL
The RAPL framework uses CPU hotplug locking to protect the rapl_packages
list and rp->lead_cpu to guarantee that

 1. the RAPL package device is not unprobed and freed
 2. the cached rp->lead_cpu is always valid

for operations like powercap sysfs accesses.

Current RAPL APIs assume being called from CPU hotplug callbacks which
hold the CPU hotplug lock, but TPMI RAPL driver invokes the APIs in the
driver's .probe() function without acquiring the CPU hotplug lock.

Fix the problem by providing both locked and lockless versions of RAPL
APIs.

Fixes: 9eef7f9da9 ("powercap: intel_rapl: Introduce RAPL TPMI interface driver")
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Cc: 6.5+ <stable@vger.kernel.org> # 6.5+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-02-13 17:31:48 +01:00
Zhang Rui
7251b9e8a0 thermal/intel: Fix intel_tcc_get_temp() to support negative CPU temperature
CPU temperature can be negative in some cases. Thus the negative CPU
temperature should not be considered as a failure.

Fix intel_tcc_get_temp() and its users to support negative CPU
temperature.

Fixes: a3c1f066e1 ("thermal/intel: Introduce Intel TCC library")
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Cc: 6.3+ <stable@vger.kernel.org> # 6.3+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-02-12 18:41:38 +01:00
Di Shen
0fac6893ff thermal: gov_power_allocator: Avoid overwriting PID coefficients from setup time
When the PID coefficients k_* are set via sysfs before the IPA
algorithm is triggered then the coefficients would be overwritten after
IPA throttle() is called. The old configuration values might be
different than the new values estimated by the IPA internal algorithm.

There might be a time delay when this overwriting happens. It
depends on the thermal zone temperature value. The temperature value
needs to cross the first trip point value then IPA algorithms start
operating. Although, the PID coefficients setup time should not be
affected or linked to any later operating phase and values must not be
overwritten.

This patch initializes params->sustainable_power when the governor
binds to thermal zone to avoid overwriting k_*. The basic function won't
be affected, as the k_* still can be estimated if the sustainable_power
is modified.

Signed-off-by: Di Shen <di.shen@unisoc.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-02-12 14:13:36 +01:00
Rafael J. Wysocki
ccd975daa8 thermal: sysfs: Fix up white space in trip_point_temp_store()
Remove an excess tab character from an otherwise empty code line.

No functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
2024-02-12 13:58:14 +01:00
Lukasz Luba
9f5fb518c3 drivers/thermal/devfreq_cooling: Use new Energy Model interface
Energy Model framework support modifications at runtime of the power
values. Use the new EM table which is protected with RCU. Align the
code so that this RCU read section is short.

This change is not expected to alter the general functionality.

Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-02-08 15:00:32 +01:00
Lukasz Luba
207472b8ef drivers/thermal/cpufreq_cooling: Use new Energy Model interface
Energy Model framework support modifications at runtime of the power
values. Use the new EM table which is protected with RCU. Align the
code so that this RCU read section is short.

This change is not expected to alter the general functionality.

Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-02-08 15:00:31 +01:00
Rafael J. Wysocki
54d94009cb thermal: gov_bang_bang: Fix possible cooling device state ping-pong
The current behavior of thermal_zone_trip_update() in the bang-bang
thermal governor may be problematic for trip points with 0 hysteresis,
because when the zone temperature reaches the trip temperature and
stays there, it will then cause the cooling device go "on" and "off"
alternately, which is not desirable.

Address this by requiring the zone temperature to actually fall below
trip->temperature - trip->hysteresis for the cooling device to go off.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-01-29 15:37:56 +01:00
Rafael J. Wysocki
f2675e588f thermal: gov_fair_share: Fix dependency on trip points ordering
The computation in the fair share governor's get_trip_level() function
currently works under the assumption that the temperature ordering of
trips[] in a thermal zone is ascending, which need not be the case.

However, get_trip_level() can be made work regardless of whether or not
the trips table is ordered by temperature in any way, so change it
accordingly.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-01-29 15:37:56 +01:00
Srinivas Pandruvada
c6a783be82 thermal: intel: powerclamp: Remove dead code for target mwait value
After conversion of this driver to use powercap idle_inject core, this
driver doesn't use target_mwait value. So remove dead code.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-01-22 11:59:22 +01:00
Rob Herring
ed7dafcc53 thermal: loongson2: Replace of_device.h with explicit includes
The DT of_device.h and of_platform.h date back to the separate
of_platform_bus_type before it as merged into the regular platform bus.
As part of that merge prepping Arm DT support 13 years ago, they
"temporarily" include each other. They also include platform_device.h
and of.h.

of_device.h isn't needed, but mod_devicetable.h and property.h were
implicitly included.

Signed-off-by: Rob Herring <robh@kernel.org>
2024-01-19 08:08:53 -06:00
Linus Torvalds
d8e6ba025f More thermal control updates for 6.8-rc1
- Add debugfs-based diagnostics support to the thermal core (Daniel
     Lezcano, Dan Carpenter).
 
   - Fix a power allocator thermal governor issue preventing it from
     resetting cooling devices sometimes (Di Shen).
 
   - Simplify the thermal netlink API and clean up related code (Rafael J.
     Wysocki).
 
   - Make the Intel HFI driver support hibernation and deep suspend
     properly (Ricardo Neri).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmWmb3cSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRx1ywP+QGUpQvZaMwJ9YYA1nyUa6UbHSEwQNPc
 xscY0gRS4ovWh9EPx7HpMFtvj33jPeZONFsLAFL7AwODY4VKwt8KSinTosmrW5QC
 izesNWFwSsI2uRYsKCvigVwU6Frl4TsCG68TwbLeZl97tX5Sdaimv7TiAeU+VxJ6
 3IQ+E2d8N+ybHCK59FI8pd1EA0ZrvGFuSN1ogFAUEDfKu74A3NUtPuXWKTlnHnEy
 XfcX9axCclQmVSaYHcDeX+y0/s2bZAnSpYeoGbc2fRAXKFgVeu7tsxtj1ei/c/sO
 sttHTYjVtvBgXlZRL/jMsKK0VDXqJOOnqpEH/z7q9dn15vQTzQv2Iq6krkVlnXtr
 le2LmOxJXtU7VEf4cm0dpmbgl8IFWuZ/47Vb/duEIhmWs5piSmRj7/GIX1DGdABd
 Es7GERvA3nYZBI/5bxw49o+4/23+62lAWN+AlmTfWSJqW64cP0osbUHUF9dN5otA
 82pERngkva3RLs3Fyh5cW4oO/pcTgHsYX/2li6WwC0GKm16mjIHer2li9VmDamyH
 rk0cILxBofN99wvTq0j1HfPOdun6yE5vi+n+YQkB5Ry38r6Osyi91dZsMk5xPQo4
 R7ISm23ExpD3bmb6DahsAC/yLlrHnHYpDjWBSbHpaU8dUiReh96AJHdM6nINVoc6
 QfgjzOwtbcaM
 =TPzj
 -----END PGP SIGNATURE-----

Merge tag 'thermal-6.8-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull more thermal control updates from Rafael Wysocki:
 "These add support for debugfs-based diagnostics to the thermal core,
  simplify the thermal netlink API, fix system-wide PM support in the
  Intel HFI driver and clean up some code.

  Specifics:

   - Add debugfs-based diagnostics support to the thermal core (Daniel
     Lezcano, Dan Carpenter)

   - Fix a power allocator thermal governor issue preventing it from
     resetting cooling devices sometimes (Di Shen)

   - Simplify the thermal netlink API and clean up related code (Rafael
     J. Wysocki)

   - Make the Intel HFI driver support hibernation and deep suspend
     properly (Ricardo Neri)"

* tag 'thermal-6.8-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  thermal/debugfs: Unlock on error path in thermal_debug_tz_trip_up()
  thermal: intel: hfi: Add syscore callbacks for system-wide PM
  thermal: gov_power_allocator: avoid inability to reset a cdev
  thermal: helpers: Rearrange thermal_cdev_set_cur_state()
  thermal: netlink: Rework notify API for cooling devices
  thermal: core: Use kstrdup_const() during cooling device registration
  thermal/debugfs: Add thermal debugfs information for mitigation episodes
  thermal/debugfs: Add thermal cooling device debugfs information
  thermal: netlink: Pass thermal zone pointer to notify routines
  thermal: netlink: Drop thermal_notify_tz_trip_add/delete()
  thermal: netlink: Pass pointers to thermal_notify_tz_trip_up/down()
  thermal: netlink: Pass pointers to thermal_notify_tz_trip_change()
2024-01-17 14:47:33 -08:00
Rafael J. Wysocki
dd75558b2d Merge branches 'thermal-core' and 'thermal-intel'
Merge additional updates for 6.8-rc1 in the thermal core and in the
Intel HFI thermal driver:

 - Add debugfs-based diagnostics support to the thermal core (Daniel
   Lezcano, Dan Carpenter).

 - Fix a power allocator thermal governor issue preventing it from
   resetting cooling devices sometimes (Di Shen).

 - Simplify the thermal netlink API and clean up related code (Rafael J.
   Wysocki).

 - Make the Intel HFI driver support hibernation and deep suspend
   properly (Ricardo Neri).

* thermal-core:
  thermal/debugfs: Unlock on error path in thermal_debug_tz_trip_up()
  thermal: gov_power_allocator: avoid inability to reset a cdev
  thermal: helpers: Rearrange thermal_cdev_set_cur_state()
  thermal: netlink: Rework notify API for cooling devices
  thermal: core: Use kstrdup_const() during cooling device registration
  thermal/debugfs: Add thermal debugfs information for mitigation episodes
  thermal/debugfs: Add thermal cooling device debugfs information
  thermal: netlink: Pass thermal zone pointer to notify routines
  thermal: netlink: Drop thermal_notify_tz_trip_add/delete()
  thermal: netlink: Pass pointers to thermal_notify_tz_trip_up/down()
  thermal: netlink: Pass pointers to thermal_notify_tz_trip_change()

* thermal-intel:
  thermal: intel: hfi: Add syscore callbacks for system-wide PM
2024-01-16 12:33:15 +01:00
Dan Carpenter
6dcb35088e thermal/debugfs: Unlock on error path in thermal_debug_tz_trip_up()
Add a missing mutex_unlock(&thermal_dbg->lock) to this error path.

Fixes: 7ef01f228c ("thermal/debugfs: Add thermal debugfs information for mitigation episodes")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-01-12 15:53:54 +01:00
Ricardo Neri
97566d09fd thermal: intel: hfi: Add syscore callbacks for system-wide PM
The kernel allocates a memory buffer and provides its location to the
hardware, which uses it to update the HFI table. This allocation occurs
during boot and remains constant throughout runtime.

When resuming from hibernation, the restore kernel allocates a second
memory buffer and reprograms the HFI hardware with the new location as
part of a normal boot. The location of the second memory buffer may
differ from the one allocated by the image kernel.

When the restore kernel transfers control to the image kernel, its HFI
buffer becomes invalid, potentially leading to memory corruption if the
hardware writes to it (the hardware continues to use the buffer from the
restore kernel).

It is also possible that the hardware "forgets" the address of the memory
buffer when resuming from "deep" suspend. Memory corruption may also occur
in such a scenario.

To prevent the described memory corruption, disable HFI when preparing to
suspend or hibernate. Enable it when resuming.

Add syscore callbacks to handle the package of the boot CPU (packages of
non-boot CPUs are handled via CPU offline). Syscore ops always run on the
boot CPU. Additionally, HFI only needs to be disabled during "deep" suspend
and hibernation. Syscore ops only run in these cases.

Cc: 6.1+ <stable@vger.kernel.org> # 6.1+
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
[ rjw: Comment adjustment, subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-01-12 15:44:42 +01:00
Di Shen
e95fa74047 thermal: gov_power_allocator: avoid inability to reset a cdev
Commit 0952177f2a ("thermal/core/power_allocator: Update once
cooling devices when temp is low") adds an update flag to avoid
triggering a thermal event when there is no need, and the thermal
cdev is updated once when the temperature is low.

But when the trips are writable, and switch_on_temp is set to be a
higher value, the cooling device state may not be reset to 0,
because last_temperature is smaller than switch_on_temp.

For example:
First:
switch_on_temp=70 control_temp=85;
Then userspace change the trip_temp:
switch_on_temp=45 control_temp=55 cur_temp=54

Then userspace reset the trip_temp:
switch_on_temp=70 control_temp=85 cur_temp=57 last_temp=54

At this time, the cooling device state should be reset to 0.
However, because cur_temp(57) < switch_on_temp(70)
last_temp(54) < switch_on_temp(70)  ---->  update = false,
update is false, the cooling device state can not be reset.

Using the observation that tz->passive can also be regarded as the
temperature status, set the update flag to the tz->passive value.

When the temperature drops below switch_on for the first time, the
states of cooling devices can be reset once, and tz->passive is updated
to 0. In the next round, because tz->passive is 0, cdev->state will not
be updated.

By using the tz->passive value as the "update" flag, the issue above
can be solved, and the cooling devices can be updated only once when the
temperature is low.

Fixes: 0952177f2a ("thermal/core/power_allocator: Update once cooling devices when temp is low")
Cc: 5.13+ <stable@vger.kernel.org> # 5.13+
Suggested-by: Wei Wang <wvw@google.com>
Signed-off-by: Di Shen <di.shen@unisoc.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-01-12 15:38:23 +01:00
Rafael J. Wysocki
fd881eac3a thermal: helpers: Rearrange thermal_cdev_set_cur_state()
Change the code layout in thermal_cdev_set_cur_state() so it returns
early on errors which is more consistent with what happens elsewhere.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-01-12 15:38:23 +01:00
Rafael J. Wysocki
11fde93931 thermal: netlink: Rework notify API for cooling devices
In analogy with some previous thermal netlink API changes, redefine
thermal_notify_cdev_state_update(), thermal_notify_cdev_add() and
thermal_notify_cdev_delete() to take a const cdev pointer as their
first argument and let them extract the requisite information from
there by themselves.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-01-12 15:38:23 +01:00
Christophe JAILLET
57a427c81c thermal: core: Use kstrdup_const() during cooling device registration
Some *thermal_cooling_device_register() calls pass a string literal as
the 'type' parameter, so kstrdup_const() can be used instead of
kstrdup() to avoid a memory allocation in such cases.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-01-12 15:38:23 +01:00
Daniel Lezcano
7ef01f228c thermal/debugfs: Add thermal debugfs information for mitigation episodes
The mitigation episodes are recorded. A mitigation episode happens
when the first trip point is crossed the way up and then the way
down. During this episode other trip points can be crossed also and
are accounted for this mitigation episode. The interesting information
is the average temperature at the trip point, the undershot and the
overshot. The standard deviation of the mitigated temperature will be
added later.

The thermal debugfs directory structure tries to stay consistent with
the sysfs one but in a very simplified way:

thermal/
 `-- thermal_zones
     |-- 0
     |   `-- mitigations
     `-- 1
         `-- mitigations

The content of the mitigations file has the following format:

,-Mitigation at 349988258us, duration=130136ms
| trip |     type | temp(°mC) | hyst(°mC) |  duration  |  avg(°mC) |  min(°mC) |  max(°mC) |
|    0 |  passive |     65000 |      2000 |     130136 |     68227 |     62500 |     75625 |
|    1 |  passive |     75000 |      2000 |     104209 |     74857 |     71666 |     77500 |
,-Mitigation at 272451637us, duration=75000ms
| trip |     type | temp(°mC) | hyst(°mC) |  duration  |  avg(°mC) |  min(°mC) |  max(°mC) |
|    0 |  passive |     65000 |      2000 |      75000 |     68561 |     62500 |     75000 |
|    1 |  passive |     75000 |      2000 |      60714 |     74820 |     70555 |     77500 |
,-Mitigation at 238184119us, duration=27316ms
| trip |     type | temp(°mC) | hyst(°mC) |  duration  |  avg(°mC) |  min(°mC) |  max(°mC) |
|    0 |  passive |     65000 |      2000 |      27316 |     73377 |     62500 |     75000 |
|    1 |  passive |     75000 |      2000 |      19468 |     75284 |     69444 |     77500 |
,-Mitigation at 39863713us, duration=136196ms
| trip |     type | temp(°mC) | hyst(°mC) |  duration  |  avg(°mC) |  min(°mC) |  max(°mC) |
|    0 |  passive |     65000 |      2000 |     136196 |     73922 |     62500 |     75000 |
|    1 |  passive |     75000 |      2000 |      91721 |     74386 |     69444 |     78125 |

More information for a better understanding of the thermal behavior
will be added after. The idea is to give detailed statistics
information about the undershots and overshots, the temperature speed,
etc... As all the information in a single file is too much, the idea
would be to create a directory named with the mitigation timestamp
where all data could be added.

Please note this code is immune against trip ordering but not against
a trip temperature change while a mitigation is happening. However,
this situation should be extremely rare, perhaps not happening and we
might question ourselves if something should be done in the core
framework for other components first.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
[ rjw: White space fixups, rebase ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-01-12 15:37:49 +01:00
Daniel Lezcano
755113d767 thermal/debugfs: Add thermal cooling device debugfs information
The thermal framework does not have any debug information except a
sysfs stat which is a bit controversial. This one allocates big chunks
of memory for every cooling devices with a high number of states and
could represent on some systems in production several megabytes of
memory for just a portion of it. As the sysfs is limited to a page
size, the output is not exploitable with large data array and gets
truncated.

The patch provides the same information than sysfs except the
transitions are dynamically allocated, thus they won't show more
events than the ones which actually occurred. There is no longer a
size limitation and it opens the field for more debugging information
where the debugfs is designed for, not sysfs.

The thermal debugfs directory structure tries to stay consistent with
the sysfs one but in a very simplified way:

thermal/
 -- cooling_devices
    |-- 0
    |   |-- clear
    |   |-- time_in_state_ms
    |   |-- total_trans
    |   `-- trans_table
    |-- 1
    |   |-- clear
    |   |-- time_in_state_ms
    |   |-- total_trans
    |   `-- trans_table
    |-- 2
    |   |-- clear
    |   |-- time_in_state_ms
    |   |-- total_trans
    |   `-- trans_table
    |-- 3
    |   |-- clear
    |   |-- time_in_state_ms
    |   |-- total_trans
    |   `-- trans_table
    `-- 4
        |-- clear
        |-- time_in_state_ms
        |-- total_trans
        `-- trans_table

The content of the files in the cooling devices directory is the same
as the sysfs one except for the trans_table which has the following
format:

Transition	Hits
1->0      	246
0->1      	246
2->1      	632
1->2      	632
3->2      	98
2->3      	98

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
[ rjw: White space fixups, rebase ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-01-12 15:34:56 +01:00
Linus Torvalds
7f73ba68cf Thermal control updates for 6.8-rc1
- Add dynamic thresholds for trip point crossing detection to prevent
    trip point crossing notifications from being sent at incorrect times
    or not at all in some cases (Rafael J. Wysocki).
 
  - Fix synchronization issues related to the resume of thermal zones
    during a system-wide resume and allow thermal zones to be resumed
    concurrently (Rafael J. Wysocki).
 
  - Modify the thermal zone unregistration to wait for the given zone to
    go away completely before returning to the caller and rework the
    sysfs interface for trip points on top of that (Rafael J. Wysocki).
 
  - Fix a possible NULL pointer dereference in thermal zone registration
    error path (Rafael J. Wysocki).
 
  - Clean up the IPA thermal governor and modify it (with the help of a
    new governor callback) to avoid allocating and freeing memory every
    time its throttling callback is invoked (Lukasz Luba).
 
  - Make the IPA thermal governor handle thermal instance weight changes
    via sysfs correctly (Lukasz Luba).
 
  - Update the thermal netlink code to avoid sending messages if there
    are no recipients (Stanislaw Gruszka).
 
  - Convert Mediatek Thermal to the json-schema (Rafał Miłecki).
 
  - Fix thermal DT bindings issue on Loongson (Binbin Zhou).
 
  - Fix returning NULL instead of -ENODEV during thermal probe on
    Loogsoon (Binbin Zhou).
 
  - Add thermal DT binding for tsens on the SM8650 platform (Neil
    Armstrong).
 
  - Add reboot on the critical trip point crossing option feature (Fabio
    Estevam).
 
  - Use DEFINE_SIMPLE_DEV_PM_OPS do define PM functions for thermal
    suspend/resume on AmLogic (Uwe Kleine-König)
 
  - Add D1/T113s THS controller support to the Sun8i thermal control
    driver (Maxim Kiselev)
 
  - Fix example in the thermal DT binding for QCom SPMI (Johan Hovold).
 
  - Fix compilation warning in the tmon utility (Florian Eckert).
 
  - Add support for interrupt-based thermal configuration on Exynos along
    with a set of related cleanups (Mateusz Majewski).
 
  - Make the Intel HFI thermal driver enable an HFI instance (eg. processor
    package) from its first online CPU and disable it when the last CPU in
    it goes offline (Ricardo Neri).
 
  - Fix a kernel-doc warning and a spello in the cpuidle_cooling thermal
    driver (Randy Dunlap).
 
  - Move the .get_temp() thermal zone callback presence check to the
    thermal zone registration code (Daniel Lezcano).
 
  - Use the for_each_trip() macro for trip points table walks in a few
    places in the thermal core (Rafael J. Wysocki).
 
  - Make all trip point updates (via sysfs as well as from the platform
    firmware) trigger trip change notifications (Rafael J. Wysocki).
 
  - Drop redundant code from the thermal core and make one function in
    it take a const pointer argument (Rafael J. Wysocki).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmWb8iUSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxKLkP/iDsuDwmhZAjbAu2iftk/8ad8Trm2VoK
 +9eZ5Eqa8lKEcJLb0RxueTnFT4ppvT/hY99HOG4FM+mCnWeH/Z32N697DhqiUg4v
 GZUpeOPzxYgsfOOTeuL5XgfrVMgBjJrJunTXmzgAd8lIhTmRbAMVmFVJ18CJO11O
 RHgqvYznYFi5cywA9/NkG2xkhFB0VDoiTuIiuMMV+pMjqF0d5ooBMkhmjvPQ5Rp9
 FjNJ7hqiTamAsDPdULAFqhIGGhKZWWFbh4+S+JPCwBW8nqvxyJpemsm20vrwctJR
 bSXWQkgkDpWEeg9yrEAOO/Uk9yGd3jiLfkvPBKbK0x/YxGZ4hOYHcbF3cOUvmPYP
 5K3ZJ61DNrzB/5S3LY54VYrWmTVRdK6Lk3HYNvfAUYFJZMZ5oMYZLCUmo4SswUdy
 UUEIY27H7L18eLhP9zCcKo4njdaVG+vXQn/rJIFOpG0k9OElzPs1X8Dp/m9pKQDR
 rDUsMXqB34NUVrIEhjAgqvwF5xHooW8gykpuJgxwBetA9w8Pls2A/mzLsDY3wgdQ
 htiANGpKTDqBQSn+HrjzYckv9/R+1tDyTJmEDNZwllA1DJfrOlpCRD2VHRpgTZEA
 Ldnq0bhyq6RQnousqxhgpYkIAoGaebs9XasRH0YtBG5gIumeWfqeVzmTcM5xdsNB
 yf6RdQy8QunS
 =QVyh
 -----END PGP SIGNATURE-----

Merge tag 'thermal-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull thermal control updates from Rafael Wysocki:
 "These add support for the D1/T113s THS controller to the sun8i driver
  and a DT-based mechanism for platforms to indicate a preference to
  reboot (instead of shutting down) on crossing a critical trip point,
  fix issues, make other improvements (in the IPA governor, the Intel
  HFI driver, the exynos driver and the thermal netlink interface among
  other places) and clean up code.

  One long-standing issue addressed here is that trip point crossing
  notifications sent to user space might be unreliable due to the
  incorrect handling of trip point hysteresis in the thermal core:
  multiple notifications might be sent for the same event or there might
  be events without any notification at all.

  Specifics:

   - Add dynamic thresholds for trip point crossing detection to prevent
     trip point crossing notifications from being sent at incorrect
     times or not at all in some cases (Rafael J. Wysocki)

   - Fix synchronization issues related to the resume of thermal zones
     during a system-wide resume and allow thermal zones to be resumed
     concurrently (Rafael J. Wysocki)

   - Modify the thermal zone unregistration to wait for the given zone
     to go away completely before returning to the caller and rework the
     sysfs interface for trip points on top of that (Rafael J. Wysocki)

   - Fix a possible NULL pointer dereference in thermal zone
     registration error path (Rafael J. Wysocki)

   - Clean up the IPA thermal governor and modify it (with the help of a
     new governor callback) to avoid allocating and freeing memory every
     time its throttling callback is invoked (Lukasz Luba)

   - Make the IPA thermal governor handle thermal instance weight
     changes via sysfs correctly (Lukasz Luba)

   - Update the thermal netlink code to avoid sending messages if there
     are no recipients (Stanislaw Gruszka)

   - Convert Mediatek Thermal to the json-schema (Rafał Miłecki)

   - Fix thermal DT bindings issue on Loongson (Binbin Zhou)

   - Fix returning NULL instead of -ENODEV during thermal probe on
     Loogsoon (Binbin Zhou)

   - Add thermal DT binding for tsens on the SM8650 platform (Neil
     Armstrong)

   - Add reboot on the critical trip point crossing option feature
     (Fabio Estevam)

   - Use DEFINE_SIMPLE_DEV_PM_OPS do define PM functions for thermal
     suspend/resume on AmLogic (Uwe Kleine-König)

   - Add D1/T113s THS controller support to the Sun8i thermal control
     driver (Maxim Kiselev)

   - Fix example in the thermal DT binding for QCom SPMI (Johan Hovold)

   - Fix compilation warning in the tmon utility (Florian Eckert)

   - Add support for interrupt-based thermal configuration on Exynos
     along with a set of related cleanups (Mateusz Majewski)

   - Make the Intel HFI thermal driver enable an HFI instance (eg.
     processor package) from its first online CPU and disable it when
     the last CPU in it goes offline (Ricardo Neri)

   - Fix a kernel-doc warning and a spello in the cpuidle_cooling
     thermal driver (Randy Dunlap)

   - Move the .get_temp() thermal zone callback presence check to the
     thermal zone registration code (Daniel Lezcano)

   - Use the for_each_trip() macro for trip points table walks in a few
     places in the thermal core (Rafael J. Wysocki)

   - Make all trip point updates (via sysfs as well as from the platform
     firmware) trigger trip change notifications (Rafael J. Wysocki)

   - Drop redundant code from the thermal core and make one function in
     it take a const pointer argument (Rafael J. Wysocki)"

* tag 'thermal-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (64 commits)
  thermal: trip: Constify thermal zone argument of thermal_zone_trip_id()
  thermal: intel: hfi: Disable an HFI instance when all its CPUs go offline
  thermal: intel: hfi: Enable an HFI instance from its first online CPU
  thermal: intel: hfi: Refactor enabling code into helper functions
  thermal/drivers/exynos: Use set_trips ops
  thermal/drivers/exynos: Use BIT wherever possible
  thermal/drivers/exynos: Split initialization of TMU and the thermal zone
  thermal/drivers/exynos: Stop using the threshold mechanism on Exynos 4210
  thermal/drivers/exynos: Simplify regulator (de)initialization
  thermal/drivers/exynos: Handle devm_regulator_get_optional return value correctly
  thermal/drivers/exynos: Wwitch from workqueue-driven interrupt handling to threaded interrupts
  thermal/drivers/exynos: Drop id field
  thermal/drivers/exynos: Remove an unnecessary field description
  tools/thermal/tmon: Fix compilation warning for wrong format
  dt-bindings: thermal: qcom-spmi-adc-tm5/hc: Clean up examples
  dt-bindings: thermal: qcom-spmi-adc-tm5/hc: Fix example node names
  thermal/drivers/sun8i: Add D1/T113s THS controller support
  dt-bindings: thermal: sun8i: Add binding for D1/T113s THS controller
  thermal: amlogic: Use DEFINE_SIMPLE_DEV_PM_OPS for PM functions
  thermal: amlogic: Make amlogic_thermal_disable() return void
  ...
2024-01-09 16:20:17 -08:00
Rafael J. Wysocki
2f521890aa thermal: netlink: Pass thermal zone pointer to notify routines
There are several rountines in the thermal netlink API that take a
thermal zone ID or a thermal zone type as their arguments, but from
their callers perspective it would be more convenient to pass a thermal
zone pointer to them and let them extract the necessary data from the
given thermal zone object by themselves.

Modify the code accordingly.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-01-09 13:32:23 +01:00
Rafael J. Wysocki
4ae535f37d thermal: netlink: Drop thermal_notify_tz_trip_add/delete()
Because thermal_notify_tz_trip_add/delete() are never used, drop them
entirely along with the related code.

The addition or removal of trip points is not supported by the thermal
core and is unlikely to be supported in the future, so it is also
unlikely that these functions will ever be needed.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-01-09 13:32:00 +01:00
Rafael J. Wysocki
f52557edf0 thermal: netlink: Pass pointers to thermal_notify_tz_trip_up/down()
Instead of requiring the callers of thermal_notify_tz_trip_up/down() to
provide specific values needed to populate struct param in them, make
them extract those values from objects passed by the callers via const
pointers.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-01-09 13:31:37 +01:00
Rafael J. Wysocki
7e72fc41d4 thermal: netlink: Pass pointers to thermal_notify_tz_trip_change()
Instead of requiring the caller of thermal_notify_tz_trip_change() to
provide specific values needed to populate struct param in it, make it
extract those values from objects passed to it by the caller via const
pointers.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-01-09 13:30:55 +01:00
Rafael J. Wysocki
17e8b76491 Merge branch 'thermal-intel'
Merge changes in thermal control drivers for Intel platforms for
6.8-rc1:

 - Make the Intel HFI thermal driver enable an HFI instance (eg. processor
   package) from its first online CPU and disable it when the last CPU in
   it goes offline (Ricardo Neri).

* thermal-intel:
  thermal: intel: hfi: Disable an HFI instance when all its CPUs go offline
  thermal: intel: hfi: Enable an HFI instance from its first online CPU
  thermal: intel: hfi: Refactor enabling code into helper functions
2024-01-05 13:34:15 +01:00
Rafael J. Wysocki
f380846462 thermal: trip: Constify thermal zone argument of thermal_zone_trip_id()
Because thermal_zone_trip_id() does not update the thermal zone object
passed to it, its pointer argument representing the thermal zone can be
const, so adjust its definition accordingly.

No functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-01-04 12:17:45 +01:00
Ricardo Neri
1c53081d77 thermal: intel: hfi: Disable an HFI instance when all its CPUs go offline
In preparation to support hibernation, add functionality to disable an HFI
instance during CPU offline. The last CPU of an instance that goes offline
will disable such instance.

The Intel Software Development Manual states that the operating system must
wait for the hardware to set MSR_IA32_PACKAGE_THERM_STATUS[26] after
disabling an HFI instance to ensure that it will no longer write on the HFI
memory. Some processors, however, do not ever set such bit. Wait a minimum
of 2ms to give time hardware to complete any pending memory writes.

Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-01-03 14:06:40 +01:00
Ricardo Neri
ac1f9230d9 thermal: intel: hfi: Enable an HFI instance from its first online CPU
Previously, HFI instances were never disabled once enabled. A CPU in an
instance only had to check during boot whether another CPU had previously
initialized the instance and its corresponding data structure.

A subsequent changeset will add functionality to disable instances
to support hibernation. Such change will also make possible to disable an
HFI instance during runtime via CPU hotplug.

Enable an HFI instance from the first of its CPUs that comes online. This
covers the boot, CPU hotplug, and resume-from-suspend cases. It also covers
systems with one or more HFI instances (i.e., packages).

Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-01-03 14:06:40 +01:00
Ricardo Neri
8a8b6bb93c thermal: intel: hfi: Refactor enabling code into helper functions
In preparation for the addition of a suspend notifier, wrap the logic to
enable HFI and program its memory buffer into helper functions. Both the
CPU hotplug callback and the suspend notifier will use them.

This refactoring does not introduce functional changes.

Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-01-03 14:06:40 +01:00
Rafael J. Wysocki
d654362d53 - Converted Mediatek Thermal to the json-schema (Rafał Miłecki)
- Fixed DT bindings issue on Loongson (Binbin Zhou)
 
 - Fixed returning NULL instead of -ENODEV on Loogsoo (Binbin Zhou)
 
 - Added the DT binding for the tsens on SM8650 platform (Neil Armstrong)
 
 - Added a reboot on critical option feature (Fabio Estevam)
 
 - Made usage of DEFINE_SIMPLE_DEV_PM_OPS on AmLogic (Uwe Kleine-König)
 
 - Added the D1/T113s THS controller support on Sun8i (Maxim Kiselev)
 
 - Fixed example in the DT binding for QCom SPMI (Johan Hovold)
 
 - Fixed compilation warning for the tmon utility (Florian Eckert)
 
 - Added interrupt based configuration on Exynos along with a set of
   related cleanups (Mateusz Majewski)
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGn3N4YVz0WNVyHskqDIjiipP6E8FAmWT0yUACgkQqDIjiipP
 6E+YjggAsrsQNrsUgiW0/M0i75kfcEBgLfXscyPFsUYwpByb+haWzAJSLrvCBko8
 zPFvNE0or6KTQJCtseWWQDNQKWilAKARurEg7vuahfWo5LmfauPGMxsw+iHM9TgW
 8Ptkc1biy3TNr1zVCpQrCZK9GdLGsG7JRgxHi4Hfr/Hb/FZN9Mm0Yk1pRpFU+pn8
 Ff5UScMcU7NWhhxlQavLOoMmyAmh2k/jvfCSlmXGj7kxRrX8YIC02dTDjFm5cpyP
 Toy2POWkcZr4Xr+kWOh8pxojZVwKU2pN7cYyLJn8+OO+rpUAf0ol4PW0mly7uMgH
 1HCZ0x0hiNJQ8N6SwN+Eptq9TpgCBw==
 =W2kc
 -----END PGP SIGNATURE-----

Merge tag 'thermal-v6.8-rc1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux into thermal

Merge thermal control material for 6.8-rc1 from Daniel Lezcano:

"- Converted Mediatek Thermal to the json-schema (Rafał Miłecki)

 - Fixed DT bindings issue on Loongson (Binbin Zhou)

 - Fixed returning NULL instead of -ENODEV on Loogsoo (Binbin Zhou)

 - Added the DT binding for the tsens on SM8650 platform (Neil Armstrong)

 - Added a reboot on critical option feature (Fabio Estevam)

 - Made usage of DEFINE_SIMPLE_DEV_PM_OPS on AmLogic (Uwe Kleine-König)

 - Added the D1/T113s THS controller support on Sun8i (Maxim Kiselev)

 - Fixed example in the DT binding for QCom SPMI (Johan Hovold)

 - Fixed compilation warning for the tmon utility (Florian Eckert)

 - Added interrupt based configuration on Exynos along with a set of
   related cleanups (Mateusz Majewski)"

* tag 'thermal-v6.8-rc1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux: (24 commits)
  thermal/drivers/exynos: Use set_trips ops
  thermal/drivers/exynos: Use BIT wherever possible
  thermal/drivers/exynos: Split initialization of TMU and the thermal zone
  thermal/drivers/exynos: Stop using the threshold mechanism on Exynos 4210
  thermal/drivers/exynos: Simplify regulator (de)initialization
  thermal/drivers/exynos: Handle devm_regulator_get_optional return value correctly
  thermal/drivers/exynos: Wwitch from workqueue-driven interrupt handling to threaded interrupts
  thermal/drivers/exynos: Drop id field
  thermal/drivers/exynos: Remove an unnecessary field description
  tools/thermal/tmon: Fix compilation warning for wrong format
  dt-bindings: thermal: qcom-spmi-adc-tm5/hc: Clean up examples
  dt-bindings: thermal: qcom-spmi-adc-tm5/hc: Fix example node names
  thermal/drivers/sun8i: Add D1/T113s THS controller support
  dt-bindings: thermal: sun8i: Add binding for D1/T113s THS controller
  thermal: amlogic: Use DEFINE_SIMPLE_DEV_PM_OPS for PM functions
  thermal: amlogic: Make amlogic_thermal_disable() return void
  thermal/thermal_of: Allow rebooting after critical temp
  reboot: Introduce thermal_zone_device_critical_reboot()
  thermal/core: Prepare for introduction of thermal reboot
  dt-bindings: thermal-zones: Document critical-action
  ...
2024-01-02 13:45:36 +01:00
Mateusz Majewski
5314b15437 thermal/drivers/exynos: Use set_trips ops
Currently, each trip point defined in the device tree corresponds to a
single hardware interrupt. This commit instead switches to using two
hardware interrupts, whose values are set dynamically using the
set_trips callback. Additionally, the critical temperature threshold is
handled specifically.

Setting interrupts in this way also fixes a long-standing lockdep
warning, which was caused by calling thermal_zone_get_trips with our
lock being held. Do note that this requires TMU initialization to be
split into two parts, as done by the parent commit: parts of the
initialization call into the thermal_zone_device structure and so must
be done after its registration, but the initialization is also
responsible for setting up calibration, which must be done before
thermal_zone_device registration, which will call set_trips for the
first time; if the calibration is not done in time, the interrupt values
will be silently wrong!

Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Mateusz Majewski <m.majewski2@samsung.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20231201095625.301884-10-m.majewski2@samsung.com
2024-01-02 09:33:19 +01:00
Mateusz Majewski
af00d48833 thermal/drivers/exynos: Use BIT wherever possible
The original driver did not use that macro and it allows us to make our
intentions slightly clearer.

Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Mateusz Majewski <m.majewski2@samsung.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20231201095625.301884-9-m.majewski2@samsung.com
2024-01-02 09:33:19 +01:00
Mateusz Majewski
b72ba67bae thermal/drivers/exynos: Split initialization of TMU and the thermal zone
This will be needed in the future, as the thermal zone subsystem might
call our callbacks right after devm_thermal_of_zone_register. Currently
we just make get_temp return EAGAIN in such case, but this will not be
possible with state-modifying callbacks, for instance set_trips.

Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Mateusz Majewski <m.majewski2@samsung.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20231201095625.301884-8-m.majewski2@samsung.com
2024-01-02 09:33:19 +01:00
Mateusz Majewski
d7a5b43191 thermal/drivers/exynos: Stop using the threshold mechanism on Exynos 4210
Exynos 4210 supports setting a base threshold value, which is added to
all trip points. This might be useful, but is not really necessary in
our usecase, so we always set it to 0 to simplify the code a bit.

Additionally, this change makes it so that we convert the value to the
calibrated one in a slightly different place. This is more correct
morally, though it does not make any change when single-point
calibration is being used (which is the case currently).

Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Mateusz Majewski <m.majewski2@samsung.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20231201095625.301884-7-m.majewski2@samsung.com
2024-01-02 09:33:19 +01:00
Mateusz Majewski
5d6976d014 thermal/drivers/exynos: Simplify regulator (de)initialization
We rewrite the initialization to enable the regulator as part of devm,
which allows us to not handle the struct instance manually.

Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Mateusz Majewski <m.majewski2@samsung.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link:
https://lore.kernel.org/r/20231201095625.301884-6-m.majewski2@samsung.com
2024-01-02 09:33:19 +01:00
Mateusz Majewski
52ef6f567e thermal/drivers/exynos: Handle devm_regulator_get_optional return value correctly
Currently, if regulator is required in the SoC, but
devm_regulator_get_optional fails for whatever reason, the execution
will proceed without propagating the error. Meanwhile there is no
reason to output the error in case of -ENODEV.

Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Mateusz Majewski <m.majewski2@samsung.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20231201095625.301884-5-m.majewski2@samsung.com
2024-01-02 09:33:19 +01:00
Mateusz Majewski
20009a8137 thermal/drivers/exynos: Wwitch from workqueue-driven interrupt handling to threaded interrupts
The workqueue boilerplate is mostly one-to-one what the threaded
interrupts do.

Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Mateusz Majewski <m.majewski2@samsung.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20231201095625.301884-4-m.majewski2@samsung.com
2024-01-02 09:33:19 +01:00
Mateusz Majewski
0ac3e1cf37 thermal/drivers/exynos: Drop id field
We do not use the value, and only Exynos 7 defines this alias anyway.

Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Mateusz Majewski <m.majewski2@samsung.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20231201095625.301884-3-m.majewski2@samsung.com
2024-01-02 09:33:19 +01:00
Mateusz Majewski
0cefaf6c89 thermal/drivers/exynos: Remove an unnecessary field description
It seems that the field has been removed in one of the previous commits,
but the description has been forgotten.

Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Mateusz Majewski <m.majewski2@samsung.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20231201095625.301884-2-m.majewski2@samsung.com
2024-01-02 09:33:19 +01:00
Maxim Kiselev
ebbf19e36d thermal/drivers/sun8i: Add D1/T113s THS controller support
This patch adds a thermal sensor controller support for the D1/T113s,
which is similar to the one on H6, but with only one sensor and
different scale and offset values.

Signed-off-by: Maxim Kiselev <bigunclemax@gmail.com>
Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20231217210629.131486-3-bigunclemax@gmail.com
2024-01-02 09:33:18 +01:00
Uwe Kleine-König
ac99b12963 thermal: amlogic: Use DEFINE_SIMPLE_DEV_PM_OPS for PM functions
This macro has the advantage over SIMPLE_DEV_PM_OPS that we don't have to
care about when the functions are actually used, so the corresponding
__maybe_unused can be dropped.

Also make use of pm_ptr() to discard all PM related stuff if CONFIG_PM
isn't enabled.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20231116112633.668826-3-u.kleine-koenig@pengutronix.de
2024-01-02 09:33:18 +01:00
Uwe Kleine-König
720f8db834 thermal: amlogic: Make amlogic_thermal_disable() return void
amlogic_thermal_disable() returned zero unconditionally and
amlogic_thermal_remove() already ignores the return value.

Make it return no value and modify amlogic_thermal_suspend to not check
the value.

This patch introduces no semantic changes, but makes it more obvious for
a human reader that amlogic_thermal_suspend() cannot fail.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20231116112633.668826-2-u.kleine-koenig@pengutronix.de
2024-01-02 09:33:18 +01:00
Fabio Estevam
62e79e38b2 thermal/thermal_of: Allow rebooting after critical temp
Currently, the default mechanism is to trigger a shutdown after the
critical temperature is reached.

In some embedded cases, such behavior does not suit well, as the board may
be unattended in the field and rebooting may be a better approach.

The bootloader may also check the temperature and only allow the boot to
proceed when the temperature is below a certain threshold.

Introduce support for allowing a reboot to be triggered after the
critical temperature is reached.

If the "critical-action" devicetree property is not found, fall back to
the shutdown action to preserve the existing default behavior.

If a custom ops->critical exists, then it takes preference over
critical-actions.

Tested on a i.MX8MM board with the following devicetree changes:

	thermal-zones {
		cpu-thermal {
			critical-action = "reboot";
		};
	};

Signed-off-by: Fabio Estevam <festevam@denx.de>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20231129124330.519423-4-festevam@gmail.com
2024-01-02 09:33:18 +01:00
Fabio Estevam
79fa723ba8 reboot: Introduce thermal_zone_device_critical_reboot()
Introduce thermal_zone_device_critical_reboot() to trigger an
emergency reboot.

It is a counterpart of thermal_zone_device_critical() with the
difference that it will force a reboot instead of shutdown.

The motivation for doing this is to allow the thermal subystem
to trigger a reboot when the temperature reaches the critical
temperature.

Signed-off-by: Fabio Estevam <festevam@denx.de>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20231129124330.519423-3-festevam@gmail.com
2024-01-02 09:33:18 +01:00
Fabio Estevam
5a0e241003 thermal/core: Prepare for introduction of thermal reboot
Add some helper functions to make it easier introducing the support
for thermal reboot.

No functional change.

Signed-off-by: Fabio Estevam <festevam@denx.de>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20231129124330.519423-2-festevam@gmail.com
2024-01-02 09:33:18 +01:00
Binbin Zhou
15ef92e9c4 drivers/thermal/loongson2_thermal: Fix incorrect PTR_ERR() judgment
PTR_ERR() returns -ENODEV when thermal-zones are undefined, and we need
-ENODEV as the right value for comparison.

Otherwise, tz->type is NULL when thermal-zones is undefined, resulting
in the following error:

[   12.290030] CPU 1 Unable to handle kernel paging request at virtual address fffffffffffffff1, era == 900000000355f410, ra == 90000000031579b8
[   12.302877] Oops[#1]:
[   12.305190] CPU: 1 PID: 181 Comm: systemd-udevd Not tainted 6.6.0-rc7+ #5385
[   12.312304] pc 900000000355f410 ra 90000000031579b8 tp 90000001069e8000 sp 90000001069eba10
[   12.320739] a0 0000000000000000 a1 fffffffffffffff1 a2 0000000000000014 a3 0000000000000001
[   12.329173] a4 90000001069eb990 a5 0000000000000001 a6 0000000000001001 a7 900000010003431c
[   12.337606] t0 fffffffffffffff1 t1 54567fd5da9b4fd4 t2 900000010614ec40 t3 00000000000dc901
[   12.346041] t4 0000000000000000 t5 0000000000000004 t6 900000010614ee20 t7 900000000d00b790
[   12.354472] t8 00000000000dc901 u0 54567fd5da9b4fd4 s9 900000000402ae10 s0 900000010614ec40
[   12.362916] s1 90000000039fced0 s2 ffffffffffffffed s3 ffffffffffffffed s4 9000000003acc000
[   12.362931] s5 0000000000000004 s6 fffffffffffff000 s7 0000000000000490 s8 90000001028b2ec8
[   12.362938]    ra: 90000000031579b8 thermal_add_hwmon_sysfs+0x258/0x300
[   12.386411]   ERA: 900000000355f410 strscpy+0xf0/0x160
[   12.391626]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   12.397898]  PRMD: 00000004 (PPLV0 +PIE -PWE)
[   12.403678]  EUEN: 00000000 (-FPE -SXE -ASXE -BTE)
[   12.409859]  ECFG: 00071c1c (LIE=2-4,10-12 VS=7)
[   12.415882] ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0)
[   12.415907]  BADV: fffffffffffffff1
[   12.415911]  PRID: 0014a000 (Loongson-64bit, Loongson-2K1000)
[   12.415917] Modules linked in: loongson2_thermal(+) vfat fat uio_pdrv_genirq uio fuse zram zsmalloc
[   12.415950] Process systemd-udevd (pid: 181, threadinfo=00000000358b9718, task=00000000ace72fe3)
[   12.415961] Stack : 0000000000000dc0 54567fd5da9b4fd4 900000000402ae10 9000000002df9358
[   12.415982]         ffffffffffffffed 0000000000000004 9000000107a10aa8 90000001002a3410
[   12.415999]         ffffffffffffffed ffffffffffffffed 9000000107a11268 9000000003157ab0
[   12.416016]         9000000107a10aa8 ffffff80020fc0c8 90000001002a3410 ffffffffffffffed
[   12.416032]         0000000000000024 ffffff80020cc1e8 900000000402b2a0 9000000003acc000
[   12.416048]         90000001002a3410 0000000000000000 ffffff80020f4030 90000001002a3410
[   12.416065]         0000000000000000 9000000002df6808 90000001002a3410 0000000000000000
[   12.416081]         ffffff80020f4030 0000000000000000 90000001002a3410 9000000002df2ba8
[   12.416097]         00000000000000b4 90000001002a34f4 90000001002a3410 0000000000000002
[   12.416114]         ffffff80020f4030 fffffffffffffff0 90000001002a3410 9000000002df2f30
[   12.416131]         ...
[   12.416138] Call Trace:
[   12.416142] [<900000000355f410>] strscpy+0xf0/0x160
[   12.416167] [<90000000031579b8>] thermal_add_hwmon_sysfs+0x258/0x300
[   12.416183] [<9000000003157ab0>] devm_thermal_add_hwmon_sysfs+0x50/0xe0
[   12.416200] [<ffffff80020cc1e8>] loongson2_thermal_probe+0x128/0x200 [loongson2_thermal]
[   12.416232] [<9000000002df6808>] platform_probe+0x68/0x140
[   12.416249] [<9000000002df2ba8>] really_probe+0xc8/0x3c0
[   12.416269] [<9000000002df2f30>] __driver_probe_device+0x90/0x180
[   12.416286] [<9000000002df3058>] driver_probe_device+0x38/0x160
[   12.416302] [<9000000002df33a8>] __driver_attach+0xa8/0x200
[   12.416314] [<9000000002deffec>] bus_for_each_dev+0x8c/0x120
[   12.416330] [<9000000002df198c>] bus_add_driver+0x10c/0x2a0
[   12.416346] [<9000000002df46b4>] driver_register+0x74/0x160
[   12.416358] [<90000000022201a4>] do_one_initcall+0x84/0x220
[   12.416372] [<90000000022f3ab8>] do_init_module+0x58/0x2c0
[   12.416386] [<90000000022f6538>] init_module_from_file+0x98/0x100
[   12.416399] [<90000000022f67f0>] sys_finit_module+0x230/0x3c0
[   12.416412] [<900000000358f7c8>] do_syscall+0x88/0xc0
[   12.416431] [<900000000222137c>] handle_syscall+0xbc/0x158

Fixes: e7e3a7c357 ("thermal/drivers/loongson-2: Add thermal management support")
Cc: Yinbo Zhu <zhuyinbo@loongson.cn>
Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/343c14de98216636a47b43e8bfd47b70d0a8e068.1700817227.git.zhoubinbin@loongson.cn
2024-01-02 09:33:18 +01:00
Lukasz Luba
a3cd6db4cc thermal: gov_power_allocator: Support new update callback of weights
When the thermal instance's weight is updated from the sysfs the governor
update_tz() callback is triggered. Implement proper reaction to this
event in the IPA, which would save CPU cycles spent in throttle().
This will speed-up the main throttle() IPA function and clean it up
a bit.

Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-12-29 18:01:01 +01:00
Lukasz Luba
bfc57bd168 thermal/sysfs: Update governors when the 'weight' has changed
Support governors update when the thermal instance's weight has changed.
This allows to adjust internal state for the governor.

Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
[ rjw: Add two empty code lines aroung the locking ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-12-29 18:01:01 +01:00
Lukasz Luba
879c9dc511 thermal/sysfs: Update instance->weight under tz lock
User space can change the weight of a thermal instance via sysfs while the
.throttle() callback is running for a governor, because weight_store()
does not use the zone lock.

The IPA governor uses instance weight values for power calculations and
caches the sum of them as total_weight, so it gets confused when one of
them changes while its .throttle() callback is running.

To prevent that from happening, use thermal zone locking in
weight_store().

Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-12-29 18:01:00 +01:00
Lukasz Luba
e3ecd5716b thermal: gov_power_allocator: Simplify checks for valid power actor
There is a need to check if the cooling device in the thermal zone
supports IPA callback and is set for control trip point.
Refactor the code which validates the power actor capabilities and
make it more consistent in all places.

No intentional functional impact.

Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-12-29 18:01:00 +01:00
Lukasz Luba
912e97c67c thermal: gov_power_allocator: Move memory allocation out of throttle()
The new thermal callback allows to react to the change of cooling
instances in the thermal zone. Move the memory allocation to that new
callback and save CPU cycles in the throttle() code path.

Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-12-29 18:01:00 +01:00
Lukasz Luba
792c3dc08d thermal: gov_power_allocator: Change trace functions
Change trace event trace_thermal_power_allocator() to not use dynamic
array for requested power and granted power for all power actors.
Instead, simplify the trace event and print other simple values.

Add new trace event to print power actor information of requested power
and granted power. That trace event would be called in a loop for each
power actor. The trace data would be easier to parse comparing to the
dynamic array implementation.

Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-12-29 18:01:00 +01:00
Lukasz Luba
3d827317b1 thermal: gov_power_allocator: Refactor checks in divvy_up_power()
Simplify the code and remove one extra 'if' block.

No intentional functional impact.

Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-12-29 18:01:00 +01:00
Lukasz Luba
2c06456f65 thermal: gov_power_allocator: Refactor check_power_actors()
In preparation for a subsequent change, rearrange check_power_actors().

No intentional functional impact.

Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-12-29 18:01:00 +01:00
Lukasz Luba
a8c959402d thermal: core: Add governor callback for thermal zone change
Add a new callback to the struct thermal_governor. It can be used for
updating governors when there is a change in the thermal zone internals,
e.g. thermal cooling device is bind to the thermal zone.

That makes possible to move some heavy operations like memory allocations
related to the number of cooling instances out of the throttle() callback.

Both callback code paths (throttle() and update_tz()) are protected with
the same thermal zone lock, which guaranties the consistency.

Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-12-29 18:01:00 +01:00
Stanislaw Gruszka
04c3b03044 thermal: netlink: Add thermal_group_has_listeners() helper
Add a helper function to check if there are listeners for
thermal_gnl_family multicast groups.

For now use it to avoid unnecessary allocations and sending
thermal genl messages when there are no recipients.

In the future, in conjunction with (not yet implemented) notification
of change in the netlink socket group membership, this helper can be
used to open/close hardware interfaces based on the presence of
user space subscribers.

Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-12-28 19:56:40 +01:00
Stanislaw Gruszka
5eb4f413ad thermal: netlink: Add enum for mutlicast groups indexes
Use enum instead of hard-coded numbers for indexing multicast groups.

Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-12-28 19:56:39 +01:00
Rafael J. Wysocki
5a5efdaffd thermal: core: Resume thermal zones asynchronously
The resume of thermal zones in thermal_pm_notify() is carried out
sequentially, which may be a problem if __thermal_zone_device_update()
takes a significant time to run for some thermal zones, because some
other thermal zones may need to wait for them to resume then and if
any other PM notifiers are going to be invoked after the thermal one,
they will need to wait for it either.

To address this, make thermal_pm_notify() switch the poll_queue delayed
work over to a one-shot thermal_zone_device_resume() work function that
will restore the original one during the thermal zone resume and queue
up poll_queue without a delay for each thermal zone.

Link: https://lore.kernel.org/linux-pm/20231120234015.3273143-1-radusolea@google.com/
Reported-by: Radu Solea <radusolea@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-12-28 14:20:16 +01:00
Rafael J. Wysocki
33fcb595dc thermal: core: Initialize poll_queue in thermal_zone_device_init()
In preparation for a subsequent change, move the initialization of the
poll_queue delayed work from thermal_zone_device_register_with_trips()
to thermal_zone_device_init() which is called by the former.

However, because thermal_zone_device_init() is also called by
thermal_pm_notify(), make the latter call cancel_delayed_work() on
poll_queue before invoking the former, so as to allow the work
item to be re-initialized safely.

Also move thermal_zone_device_check() which needs to be defined
before thermal_zone_device_init(), so the latter can pass it to the
INIT_DELAYED_WORK() macro.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-12-28 14:20:16 +01:00
Rafael J. Wysocki
4e814173a8 thermal: core: Fix thermal zone suspend-resume synchronization
There are 3 synchronization issues with thermal zone suspend-resume
during system-wide transitions:

 1. The resume code runs in a PM notifier which is invoked after user
    space has been thawed, so it can run concurrently with user space
    which can trigger a thermal zone device removal.  If that happens,
    the thermal zone resume code may use a stale pointer to the next
    list element and crash, because it does not hold thermal_list_lock
    while walking thermal_tz_list.

 2. The thermal zone resume code calls thermal_zone_device_init()
    outside the zone lock, so user space or an update triggered by
    the platform firmware may see an inconsistent state of a
    thermal zone leading to unexpected behavior.

 3. Clearing the in_suspend global variable in thermal_pm_notify()
    allows __thermal_zone_device_update() to continue for all thermal
    zones and it may as well run before the thermal_tz_list walk (or
    at any point during the list walk for that matter) and attempt to
    operate on a thermal zone that has not been resumed yet.  It may
    also race destructively with thermal_zone_device_init().

To address these issues, add thermal_list_lock locking to
thermal_pm_notify(), especially arount the thermal_tz_list,
make it call thermal_zone_device_init() back-to-back with
__thermal_zone_device_update() under the zone lock and replace
in_suspend with per-zone bool "suspend" indicators set and unset
under the given zone's lock.

Link: https://lore.kernel.org/linux-pm/20231218162348.69101-1-bo.ye@mediatek.com/
Reported-by: Bo Ye <bo.ye@mediatek.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-12-28 14:20:15 +01:00
Randy Dunlap
5f70413a85 thermal: cpuidle_cooling: fix kernel-doc warning and a spello
Correct one misuse of kernel-doc notation and one spelling error as
reported by codespell.

cpuidle_cooling.c:152: warning: cannot understand function prototype: 'struct thermal_cooling_device_ops cpuidle_cooling_ops = '

For the kernel-doc warning, don't use "/**" for a comment on data.
kernel-doc can be used for structure declarations but not definitions.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-12-21 12:05:48 +01:00
Rafael J. Wysocki
04e6ccfc93 thermal: core: Fix NULL pointer dereference in zone registration error path
If device_register() in thermal_zone_device_register_with_trips()
returns an error, the tz variable is set to NULL and subsequently
dereferenced in kfree(tz->tzp).

Commit adc8749b15 ("thermal/drivers/core: Use put_device() if
device_register() fails") added the tz = NULL assignment in question to
avoid a possible double-free after dropping the reference to the zone
device.  However, after commit 4649620d94 ("thermal: core: Make
thermal_zone_device_unregister() return after freeing the zone"), that
assignment has become redundant, because dropping the reference to the
zone device does not cause the zone object to be freed any more.

Drop it to address the NULL pointer dereference.

Fixes: 3d439b1a2a ("thermal/core: Alloc-copy-free the thermal zone parameters structure")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2023-12-15 18:24:24 +01:00
Daniel Lezcano
404f62cd64 thermal/core: Check get_temp ops is present when registering a tz
Initially the check against the get_temp ops in the
thermal_zone_device_update() was put in there in order to catch
drivers not providing this method.

Instead of checking again and again the function if the ops exists in
the update function, let's do the check at registration time, so it is
checked one time and for all.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-12-13 14:35:32 +01:00
Rafael J. Wysocki
bdc22c8d52 thermal: trip: Send trip change notifications on all trip updates
The _store callbacks of the trip point temperature and hysteresis sysfs
attributes invoke thermal_notify_tz_trip_change() to send a notification
regarding the trip point change, but when trip points are updated by the
platform firmware, trip point change notifications are not sent.

To make the behavior after a trip point change more consistent,
modify all of the 3 places where trip point temperature is updated
to use a new function called thermal_zone_set_trip_temp() for this
purpose and make that function call thermal_notify_tz_trip_change().

Note that trip point hysteresis can only be updated via sysfs and
trip_point_hyst_store() calls thermal_notify_tz_trip_change() already,
so this code path need not be changed.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2023-12-13 12:37:01 +01:00
Rafael J. Wysocki
183b64132f thermal: netlink: Use for_each_trip() in thermal_genl_cmd_tz_get_trip()
Make thermal_genl_cmd_tz_get_trip() use for_each_trip() instead of an open-
coded loop over trip indices.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2023-12-13 12:33:43 +01:00
Rafael J. Wysocki
2e3e7dad4b thermal: helpers: Use for_each_trip() in __thermal_zone_get_temp()
Make __thermal_zone_get_temp() use for_each_trip() instead of an open-
coded loop over trip indices.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2023-12-13 12:33:43 +01:00
Rafael J. Wysocki
0c0c4740c9 thermal: trip: Use for_each_trip() in __thermal_zone_set_trips()
Make __thermal_zone_set_trips() use for_each_trip() instead of an open-
coded loop over trip indices.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2023-12-13 12:33:43 +01:00
Rafael J. Wysocki
b6515a88ba thermal: trip: Drop redundant __thermal_zone_get_trip() header
The __thermal_zone_get_trip() header in drivers/thermal/thermal_core.h
is redundant, because there is one already in thermal.h, so drop it.

No functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2023-12-13 12:31:59 +01:00
Rafael J. Wysocki
b38aa87f67 thermal: core: Rework thermal zone availability check
In order to avoid running __thermal_zone_device_update() for thermal
zones going away, the thermal zone lock is held around device_del()
in thermal_zone_device_unregister() and thermal_zone_device_update()
passes the given thermal zone device to device_is_registered().
This allows thermal_zone_device_update() to skip the
__thermal_zone_device_update() if device_del() has already run for
the thermal zone at hand.

However, instead of looking at driver core internals, the thermal
subsystem may as well rely on its own data structures for this
purpose.  Namely, if the thermal zone is not present in
thermal_tz_list, it can be regarded as unavailable, which in fact is
already the case in thermal_zone_device_unregister().  Accordingly,
the device_is_registered() check in thermal_zone_device_update() can
be replaced with checking whether or not the node list_head in struct
thermal_zone_device is empty, in which case it is not there in
thermal_tz_list.

To make this work, though, it is necessary to initialize tz->node
in thermal_zone_device_register_with_trips() before registering the
thermal zone device and it needs to be added to thermal_tz_list and
deleted from it under its zone lock.

After the above modifications, the zone lock does not need to be
held around device_del() in thermal_zone_device_unregister() any more.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-and-tested-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2023-12-12 13:01:03 +01:00
Rafael J. Wysocki
c3ffdfff97 thermal: Drop redundant and confusing device_is_registered() checks
Multiple places in the thermal subsystem (most importantly, sysfs
attribute callback functions) check if the given thermal zone device is
still registered in order to return early in case the device_del() in
thermal_zone_device_unregister() has run already.

However, after thermal_zone_device_unregister() has been made wait for
all of the zone-related activity to complete before returning, it is
not necessary to do that any more, because all of the code holding a
reference to the thermal zone device object will be waited for even if
it does not do anything special to enforce this.

Accordingly, drop all of the device_is_registered() checks that are now
redundant and get rid of the zone locking that is not necessary any more
after dropping them.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-and-tested-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2023-12-12 13:00:28 +01:00
Rafael J. Wysocki
4649620d94 thermal: core: Make thermal_zone_device_unregister() return after freeing the zone
Make thermal_zone_device_unregister() wait until all of the references
to the given thermal zone object have been dropped and free it before
returning.

This guarantees that when thermal_zone_device_unregister() returns,
there is no leftover activity regarding the thermal zone in question
which is required by some of its callers (for instance, modular driver
code that wants to know when it is safe to let the module go away).

Subsequently, this will allow some confusing device_is_registered()
checks to be dropped from the thermal sysfs and core code.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-and-tested-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2023-12-11 20:49:53 +01:00
Rafael J. Wysocki
18dfb0e4c3 thermal: sysfs: Rework the reading of trip point attributes
Rework the _show() callback functions for the trip point temperature,
hysteresis and type attributes to avoid copying the values of struct
thermal_trip fields that they do not use and make them carry out the
same validation checks as the corresponding _store() callback functions.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2023-12-06 21:29:52 +01:00
Rafael J. Wysocki
be0a3600aa thermal: sysfs: Rework the handling of trip point updates
Both trip_point_temp_store() and trip_point_hyst_store() use
thermal_zone_set_trip() to update a given trip point, but none of them
actually needs to change more than one field in struct thermal_trip
representing it.  However, each of them effectively calls
__thermal_zone_get_trip() twice in a row for the same trip index value,
once directly and once via thermal_zone_set_trip(), which is not
particularly efficient, and the way in which thermal_zone_set_trip()
carries out the update is not particularly straightforward.

Moreover, input processing need not be done under the thermal zone lock
in any of these functions.

Rework trip_point_temp_store() and trip_point_hyst_store() to address
the above, move the part of thermal_zone_set_trip() that is still
useful to a new function called thermal_zone_trip_updated() and drop
the rest of it.

While at it, make trip_point_hyst_store() reject negative hysteresis
values.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2023-12-06 21:29:52 +01:00
Rafael J. Wysocki
5973024164 thermal: trip: Drop a redundant check from thermal_zone_set_trip()
After recent changes in the thermal framework, a trip points array is
required for registering a thermal zone that is not tripless, so the
tz->trips pointer in thermal_zone_set_trip() is never NULL and the
check involving it is redundant.  Drop that check.

No functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2023-11-30 14:38:06 +01:00
Lukasz Luba
401888e720 thermal: gov_power_allocator: Rearrange initialization of local variables
Rearrange the initialization of local variables in allocate_power() so
as to improve code clarity and the visibility of the initial values.

This change is not expected to alter the general functionality.

Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-11-28 16:02:14 +01:00
Lukasz Luba
0458d536ae thermal: gov_power_allocator: Remove excessive local variables
Local variable 'ret' in allocate_power() is only used in the return
statement, so drop it.

Local variable 'trip_max' in allocate_power() is only used for caching
the params->trip_max value which may as well be accessed directly as
needed, so drop it either.

This change is not expected to alter the general functionality.

Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-11-28 16:02:14 +01:00
Lukasz Luba
30e1178c10 thermal: gov_power_allocator: Use shorter paths to access data when possible
The 'cdev' pointer in allow_maximum_power() is valid, so there is no
need to use 'instance->cdev' instead of it.

This change is not expected to alter the general functionality.

Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-11-28 16:02:14 +01:00
Lukasz Luba
499cc391b4 thermal: gov_power_allocator: Rearrange local variables
Rearrange the order of local variable definitions in multiple functions
so as to follow the kernel coding style in that respect.

Also, move local variable definitions located in nested code blocks to
the beginning of each function to improve the visibility of all local
variables in use.

This change is not expected to alter the general functionality.

Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-11-28 16:02:14 +01:00
Lukasz Luba
c7568e7841 thermal: gov_power_allocator: Check the cooling devices only for trip_max
The throttling logic only cares about the last passive trip point and
the cooling devices attached to it.

Therefore, there is no need to bail out if other trip points have
cooling devices which are not a supported by the IPA.

Check the cooling devices only for 'trip_max' during the binding.

Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-11-28 16:02:14 +01:00
Lukasz Luba
e83747c2f8 thermal: gov_power_allocator: Set up trip points earlier
Set up the trip points at the beginning of the binding function.

This simplifies the code a bit and allows for further cleanups.

Also add a check to fail the binding if the last passive trip point is
not found.

Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-11-28 16:02:13 +01:00
Lukasz Luba
4e6d4687f7 thermal: gov_power_allocator: Rename trip_max_desired_temperature
Refactor the code and rename the last passive trip point field.

There is a comment describing the field properly. Use shorter field name
so as to allow to clarify the code.

This change is not expected to alter the general functionality.

Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-11-28 16:02:13 +01:00
Rafael J. Wysocki
f475079881 thermal: ACPI: Move the ACPI thermal library to drivers/acpi/
The ACPI thermal library contains functions that can be used to
retrieve trip point temperature values through the platform firmware
for various types of trip points.  Each of these functions basically
evaluates a specific ACPI object, checks if the value produced by it
is reasonable and returns it (or THERMAL_TEMP_INVALID if anything
fails).

It made sense to hold it in drivers/thermal/ so long as it was only used
by the code in that directory, but since it is also going to be used by
the ACPI thermal driver located in drivers/acpi/, move it to the latter
in order to keep the code related to evaluating ACPI objects defined in
the specification proper together.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-11-21 15:04:08 +01:00
Rafael J. Wysocki
44844db913 thermal: core: Add trip thresholds for trip crossing detection
The trip crossing detection in handle_thermal_trip() does not work
correctly in the cases when a trip point is crossed on the way up and
then the zone temperature stays above its low temperature (that is, its
temperature decreased by its hysteresis).  The trip temperature may
be passed by the zone temperature subsequently in that case, even
multiple times, but that does not count as the trip crossing as long as
the zone temperature does not fall below the trip's low temperature or,
in other words, until the trip is crossed on the way down.

|-----------low--------high------------|
             |<--------->|
             |    hyst   |
             |           |
             |          -|--> crossed on the way up
             |
         <---|-- crossed on the way down

However, handle_thermal_trip() will invoke thermal_notify_tz_trip_up()
every time the trip temperature is passed by the zone temperature on
the way up regardless of whether or not the trip has been crossed on
the way down yet.  Moreover, it will not call thermal_notify_tz_trip_down()
if the last zone temperature was between the trip's temperature and its
low temperature, so some "trip crossed on the way down" events may not
be reported.

To address this issue, introduce trip thresholds equal to either the
temperature of the given trip, or its low temperature, such that if
the trip's threshold is passed by the zone temperature on the way up,
its value will be set to the trip's low temperature and
thermal_notify_tz_trip_up() will be called, and if the trip's threshold
is passed by the zone temperature on the way down, its value will be set
to the trip's temperature (high) and thermal_notify_tz_trip_down() will
be called.  Accordingly, if the threshold is passed on the way up, it
cannot be passed on the way up again until its passed on the way down
and if it is passed on the way down, it cannot be passed on the way down
again until it is passed on the way up which guarantees correct
triggering of trip crossing notifications.

If the last temperature of the zone is invalid, the trip's threshold
will be set depending of the zone's current temperature: If that
temperature is above the trip's temperature, its threshold will be
set to its low temperature or otherwise its threshold will be set to
its (high) temperature.  Because the zone temperature is initially
set to invalid and tz->last_temperature is only updated by
update_temperature(), this is sufficient to set the correct initial
threshold values for all trips.

Link: https://lore.kernel.org/all/20220718145038.1114379-4-daniel.lezcano@linaro.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-11-20 16:59:55 +01:00
Linus Torvalds
4ac4677fdb Thermal control updates for 6.7-rc1
- Untangle the initialization and updates of passive and active trip
    points in the ACPI thermal driver (Rafael Wysocki).
 
  - Reduce code duplication related to the initialization and updates
    of trip points in the ACPI thermal driver (Rafael Wysocki).
 
  - Use trip pointers for cooling device binding in the ACPI thermal
    driver (Rafael Wysocki).
 
  - Simplify critical and hot trips representation in the ACPI thermal
    driver (Rafael Wysocki).
 
  - Use trip pointers in thermal governors and in the related part of
    the thermal core (Rafael Wysocki).
 
  - Drop the trips_disabled bitmask that has become redundant from the
    thermal core (Rafael Wysocki).
 
  - Avoid updating trip points when the thermal zone temperature falls
    into a trip point's hysteresis range (ícolas F. R. A. Prado).
 
  - Add power floor notifications support to the int340x thermal control
    driver (Srinivas Pandruvada).
 
  - Rework updating trip points in the int340x thermal driver so that it
    does not access thermal zone internals directly (Rafael Wysocki).
 
  - Use param_get_byte() instead of param_get_int() as the max_idle module
    parameter .get() callback in the Intel powerclamp thermal driver to
    avoid possible out-of-bounds access (David Arcari).
 
  - Add workload hints support to the int340x thermal driver (Srinivas
    Pandruvada).
 
  - Add support for Mediatek LVTS MT8192 along with suspend/resume
    routines (Balsam Chihi).
 
  - Fix probe for THERMAL_V2 in the Mediatek LVTS driver (Markus
    Schneider-Pargmann).
 
  - Remove duplicate error message from the max76620 driver when
    thermal_of_zone_register() fails (Thierry Reding).
 
  - Add i.MX7D compatible bindings to fix a warning from dtbs_check for
    the imx6ul platform (Alexander Stein).
 
  - Add sa8775p compatible to the QCom tsens driver (Priyansh Jain).
 
  - Fix error check in lvts_debugfs_init() to be against PTR_ERR() in the
    LVTS Mediatek driver (Minjie Du).
 
  - Remove unused variable in thermal/tools (Kuan-Wei Chiu).
 
  - Document the imx8dl thermal sensor (Fabio Estevam).
 
  - Add variable names in callback prototypes to prevent warning from
    checkpatch.pl in the imx8mm driver (Bragatheswaran Manickavel).
 
  - Add missing unevaluatedProperties on child node schemas for tegra124
    (Rob Herring)
 
  - Add mt7988 support to the Mediatek LVTS driver (Frank Wunderlich).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmU6a+sSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRx9s8P/A/u3d8jn5Dj7B4N3Y4uTL1IG3aleGnZ
 EDxoiW6ju723s9ZMGsJg6qzXoksJDjleUTGtdyGYgbPyFz44/6/bcYqI6H0GsHlH
 lah5jAgXEaJdbc9CwwhjBoVEN6wO5ATyDIJ1H+iU0x/6svZpXN+tlQp2dAFacAxD
 CyZkFrBrSrEercO59NxCTyA9n9hfAfHdX7hBd5e686SXtcR+wdLyLfrPiEoE4oYq
 vW9QkzTSK+nILrLTSGnrmtYvsOzbyRKqoIPrRhqSwgobvDHvYgkokdESYGzu/d3F
 k922eHV8C4RyFZmVT9o+dVaK27gZ3KQI3d8BfpitMYD/Tb0N4axmwkZ44rrA8OOy
 kpTFENvlICyUpfcTqwLYIZT9WK7rXKpxqjLeKMZvzrUWBoNm8tFsuJy3Hxd0IP6V
 F2X62UH/sA4QAtJHD0VP5mc7FRbuFBdpyQ6Tq9ZNUQGoOeFT6VxQaqzRzqWwV6Na
 lxxvX3FST4lRguNRg9NW3RrOxQOyLaQP1TIN3CaZ0rOs2dlGz8v2UN252uM/K1Y0
 JO0rXS2HoXdUqWWaIMnVvNqOsaYeMGhVfCNySEn1DoChpInAZF50Z0/fL5Vpb51L
 cl0VrNxReHwDHeuRtKcFgH1S8YrsehrRAZp3bDArmD/XszJQR+Rv5N6BSdURDLSU
 1mkfn37l8fKc
 =PSaG
 -----END PGP SIGNATURE-----

Merge tag 'thermal-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull thermal control updates from Rafael Wysocki:
 "These further rework the ACPI thermal driver, after the changes made
  to it in the previous cycle, to make it easier to grasp, get rid of
  redundant pieces of internal data structures and eliminate its
  reliance on a specific ordering of trip point objects in the thermal
  core, make thermal core adjustments needed for the ACPI thermal driver
  rework, modify the thermal governor interface so as to use trip
  pointers for representing trip points in it, switch over multiple
  thermal drivers to using void platform driver remove callbacks, add
  support for 2 hardware features to the Intel int340x thermal driver,
  add support for new hardware on ARM platforms, update documentation,
  fix problems, clean up code and update the MAINTAINERS record for
  thermal control.

  Specifics:

   - Untangle the initialization and updates of passive and active trip
     points in the ACPI thermal driver (Rafael Wysocki)

   - Reduce code duplication related to the initialization and updates
     of trip points in the ACPI thermal driver (Rafael Wysocki)

   - Use trip pointers for cooling device binding in the ACPI thermal
     driver (Rafael Wysocki)

   - Simplify critical and hot trips representation in the ACPI thermal
     driver (Rafael Wysocki)

   - Use trip pointers in thermal governors and in the related part of
     the thermal core (Rafael Wysocki)

   - Drop the trips_disabled bitmask that has become redundant from the
     thermal core (Rafael Wysocki)

   - Avoid updating trip points when the thermal zone temperature falls
     into a trip point's hysteresis range (ícolas F. R. A. Prado)

   - Add power floor notifications support to the int340x thermal
     control driver (Srinivas Pandruvada)

   - Rework updating trip points in the int340x thermal driver so that
     it does not access thermal zone internals directly (Rafael
     Wysocki)

   - Use param_get_byte() instead of param_get_int() as the max_idle
     module parameter .get() callback in the Intel powerclamp thermal
     driver to avoid possible out-of-bounds access (David Arcari)

   - Add workload hints support to the int340x thermal driver (Srinivas
     Pandruvada)

   - Add support for Mediatek LVTS MT8192 along with suspend/resume
     routines (Balsam Chihi)

   - Fix probe for THERMAL_V2 in the Mediatek LVTS driver (Markus
     Schneider-Pargmann)

   - Remove duplicate error message from the max76620 driver when
     thermal_of_zone_register() fails (Thierry Reding)

   - Add i.MX7D compatible bindings to fix a warning from dtbs_check for
     the imx6ul platform (Alexander Stein)

   - Add sa8775p compatible to the QCom tsens driver (Priyansh Jain)

   - Fix error check in lvts_debugfs_init() to be against PTR_ERR() in
     the LVTS Mediatek driver (Minjie Du)

   - Remove unused variable in thermal/tools (Kuan-Wei Chiu)

   - Document the imx8dl thermal sensor (Fabio Estevam)

   - Add variable names in callback prototypes to prevent warning from
     checkpatch.pl in the imx8mm driver (Bragatheswaran Manickavel)

   - Add missing unevaluatedProperties on child node schemas for
     tegra124 (Rob Herring)

   - Add mt7988 support to the Mediatek LVTS driver (Frank Wunderlich)"

* tag 'thermal-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (111 commits)
  thermal: ACPI: Include the right header file
  thermal: core: Don't update trip points inside the hysteresis range
  thermal: core: Pass trip pointer to governor throttle callback
  thermal: gov_step_wise: Fold update_passive_instance() into its caller
  thermal: gov_power_allocator: Use trip pointers instead of trip indices
  thermal: gov_fair_share: Rearrange get_trip_level()
  thermal: trip: Define for_each_trip() macro
  thermal: trip: Simplify computing trip indices
  thermal/qcom/tsens: Drop ops_v0_1
  thermal/drivers/mediatek/lvts_thermal: Update calibration data documentation
  thermal/drivers/mediatek/lvts_thermal: Add mt8192 support
  thermal/drivers/mediatek/lvts_thermal: Add suspend and resume
  dt-bindings: thermal: mediatek: Add LVTS thermal controller definition for mt8192
  thermal/drivers/mediatek: Fix probe for THERMAL_V2
  thermal/drivers/max77620: Remove duplicate error message
  dt-bindings: timer: add imx7d compatible
  dt-bindings: net: microchip: Allow nvmem-cell usage
  dt-bindings: imx-thermal: Add #thermal-sensor-cells property
  dt-bindings: thermal: tsens: Add sa8775p compatible
  thermal/drivers/mediatek/lvts_thermal: Fix error check in lvts_debugfs_init()
  ...
2023-10-31 15:28:37 -10:00
Linus Torvalds
befaa609f4 hardening updates for v6.7-rc1
- Add LKDTM test for stuck CPUs (Mark Rutland)
 
 - Improve LKDTM selftest behavior under UBSan (Ricardo Cañuelo)
 
 - Refactor more 1-element arrays into flexible arrays (Gustavo A. R. Silva)
 
 - Analyze and replace strlcpy and strncpy uses (Justin Stitt, Azeem Shaikh)
 
 - Convert group_info.usage to refcount_t (Elena Reshetova)
 
 - Add __counted_by annotations (Kees Cook, Gustavo A. R. Silva)
 
 - Add Kconfig fragment for basic hardening options (Kees Cook, Lukas Bulwahn)
 
 - Fix randstruct GCC plugin performance mode to stay in groups (Kees Cook)
 
 - Fix strtomem() compile-time check for small sources (Kees Cook)
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmU/3cUWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJsEoEACBGPSiOmfSWdH3TOnIG270PD24
 jGjg8KFv7RC/JTOdYmpLl0okdlGT9LvjN/ToSSDEw3PIayxoXUdhkbYy0MYtiV3m
 yz2ozDTzJuplQX/W2fPE+nXSzIwHao2zjPPFjHnT7lt8IIjhgjiOtLfZ2gGUkW99
 Mdu2aWh3u0r4tC8OS23++yN5ibRc5l72efsjDWjZ0aPXnxE1bjmLMiIPiizpndIf
 beasPuDBs98sJVYouemCwnsPXuXOPz3Q1Cpo/fTd+TMTJCLSemCQZCTuOBU0acI/
 ZjLCgCaJU1yIYKBMtrIN4G9kITZniXX3/Nm4o6NQMVlcCqMeNaHuflomqWoqWfhE
 UPbRo2eghZOaMNiCKLLvZDIqPrh1IcsiEl6Ef3W4hICc42GTK96IuGisIvDXwQ4N
 /SzTOupJuN42noh3z1M3XuZy5RoXJ99IYDNY5CTKf9IdqvA0bbGkU3nb1gZH/xw9
 BjTqKzR/7K1kTXuSgagDZ1Wceej9pZxhX7E3IHYsP8ZOvKug3EeL4yybVwQ3HRfq
 Qnzcp/qPB9cOkLSQXveRTFTsj2mX28Gixct/iDuc1jIYwGQlY1gI6dcUcqby6ptM
 BrQti7eR2NH2+T3aE2UVCIWsZVhx7NaSF+z8JxfAuu56jicc4xJVsi8zrNveWX5M
 m2VXyBl3121BVtKi4w==
 =0iVF
 -----END PGP SIGNATURE-----

Merge tag 'hardening-v6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull hardening updates from Kees Cook:
 "One of the more voluminous set of changes is for adding the new
  __counted_by annotation[1] to gain run-time bounds checking of
  dynamically sized arrays with UBSan.

   - Add LKDTM test for stuck CPUs (Mark Rutland)

   - Improve LKDTM selftest behavior under UBSan (Ricardo Cañuelo)

   - Refactor more 1-element arrays into flexible arrays (Gustavo A. R.
     Silva)

   - Analyze and replace strlcpy and strncpy uses (Justin Stitt, Azeem
     Shaikh)

   - Convert group_info.usage to refcount_t (Elena Reshetova)

   - Add __counted_by annotations (Kees Cook, Gustavo A. R. Silva)

   - Add Kconfig fragment for basic hardening options (Kees Cook, Lukas
     Bulwahn)

   - Fix randstruct GCC plugin performance mode to stay in groups (Kees
     Cook)

   - Fix strtomem() compile-time check for small sources (Kees Cook)"

* tag 'hardening-v6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (56 commits)
  hwmon: (acpi_power_meter) replace open-coded kmemdup_nul
  reset: Annotate struct reset_control_array with __counted_by
  kexec: Annotate struct crash_mem with __counted_by
  virtio_console: Annotate struct port_buffer with __counted_by
  ima: Add __counted_by for struct modsig and use struct_size()
  MAINTAINERS: Include stackleak paths in hardening entry
  string: Adjust strtomem() logic to allow for smaller sources
  hardening: x86: drop reference to removed config AMD_IOMMU_V2
  randstruct: Fix gcc-plugin performance mode to stay in group
  mailbox: zynqmp: Annotate struct zynqmp_ipi_pdata with __counted_by
  drivers: thermal: tsens: Annotate struct tsens_priv with __counted_by
  irqchip/imx-intmux: Annotate struct intmux_data with __counted_by
  KVM: Annotate struct kvm_irq_routing_table with __counted_by
  virt: acrn: Annotate struct vm_memory_region_batch with __counted_by
  hwmon: Annotate struct gsc_hwmon_platform_data with __counted_by
  sparc: Annotate struct cpuinfo_tree with __counted_by
  isdn: kcapi: replace deprecated strncpy with strscpy_pad
  isdn: replace deprecated strncpy with strscpy
  NFS/flexfiles: Annotate struct nfs4_ff_layout_segment with __counted_by
  nfs41: Annotate struct nfs4_file_layout_dsaddr with __counted_by
  ...
2023-10-30 19:09:55 -10:00
Rafael J. Wysocki
607218deac - Add support for Mediatek LVTS MT8192 driver along with the
suspend/resume routines (Balsam Chihi)
 
 - Fix probe for THERMAL_V2 for the Mediatek LVTS driver (Markus
   Schneider-Pargmann)
 
 - Remove duplicate error message in the max76620 driver when
   thermal_of_zone_register() fails as the sub routine already show one
   (Thierry Reding)
 
 - Add i.MX7D compatible bindings to fix a warning from dtbs_check for
   the imx6ul platform (Alexander Stein)
 
 - Add sa8775p compatible for the QCom tsens driver (Priyansh Jain)
 
 - Fix error check in lvts_debugfs_init() which is checking against
   NULL instead of PTR_ERR() on the LVTS Mediatek driver (Minjie Du)
 
 - Remove unused variable in the thermal/tools (Kuan-Wei Chiu)
 
 - Document the imx8dl thermal sensor (Fabio Estevam)
 
 - Add variable names in callback prototypes to prevent warning from
   checkpatch.pl for the imx8mm driver (Bragatheswaran Manickavel)
 
 - Add missing unevaluatedProperties on child node schemas for tegra124
   (Rob Herring)
 
 - Add mt7988 support for the Mediatek LVTS driver (Frank Wunderlich)
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGn3N4YVz0WNVyHskqDIjiipP6E8FAmU35s8ACgkQqDIjiipP
 6E8nggf/VQei7x/fRTFRYJVyP6SdwzpZPcyMYEcwL50wpp/d/R5xapFzM7nofV6n
 SYFXkpuejGH5d5JbJjpkv1S56Q/Bf6yy7cGciKiBmxaU2cWY636KmhlUPe2kfmX5
 ipiNi/ZU0zAjZM7FRpBVPgp0HTvnGehDzIaSi6Y/vKVAw5TPTzjLz8oNoeHGP4/6
 xNRgJuTwAvqL1lqZoo9YV7tVdkcuQamf3dkDkHqvSSL9UilHZEqrtLyh1Iz12+py
 LESjlePz1jzK169oG5QiA4thIMV7CPTwkrkAJXwT2H2zWccg7JzZ7K1S3aJq2vHL
 5UMcsUE2wCczfdnlhxD6wkJ3CDyirA==
 =i0K9
 -----END PGP SIGNATURE-----

Merge tag 'thermal-v6.7-rc1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux

Merge thermal control (ARM drivers mostly) updates for 6.7-rc1 from
Daniel Lezcano:

"- Add support for Mediatek LVTS MT8192 driver along with the
   suspend/resume routines (Balsam Chihi)

 - Fix probe for THERMAL_V2 for the Mediatek LVTS driver (Markus
   Schneider-Pargmann)

 - Remove duplicate error message in the max76620 driver when
   thermal_of_zone_register() fails as the sub routine already show one
   (Thierry Reding)

 - Add i.MX7D compatible bindings to fix a warning from dtbs_check for
   the imx6ul platform (Alexander Stein)

 - Add sa8775p compatible for the QCom tsens driver (Priyansh Jain)

 - Fix error check in lvts_debugfs_init() which is checking against
   NULL instead of PTR_ERR() on the LVTS Mediatek driver (Minjie Du)

 - Remove unused variable in the thermal/tools (Kuan-Wei Chiu)

 - Document the imx8dl thermal sensor (Fabio Estevam)

 - Add variable names in callback prototypes to prevent warning from
   checkpatch.pl for the imx8mm driver (Bragatheswaran Manickavel)

 - Add missing unevaluatedProperties on child node schemas for tegra124
  (Rob Herring)

 - Add mt7988 support for the Mediatek LVTS driver (Frank Wunderlich)"

* tag 'thermal-v6.7-rc1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux:
  thermal/qcom/tsens: Drop ops_v0_1
  thermal/drivers/mediatek/lvts_thermal: Update calibration data documentation
  thermal/drivers/mediatek/lvts_thermal: Add mt8192 support
  thermal/drivers/mediatek/lvts_thermal: Add suspend and resume
  dt-bindings: thermal: mediatek: Add LVTS thermal controller definition for mt8192
  thermal/drivers/mediatek: Fix probe for THERMAL_V2
  thermal/drivers/max77620: Remove duplicate error message
  dt-bindings: timer: add imx7d compatible
  dt-bindings: net: microchip: Allow nvmem-cell usage
  dt-bindings: imx-thermal: Add #thermal-sensor-cells property
  dt-bindings: thermal: tsens: Add sa8775p compatible
  thermal/drivers/mediatek/lvts_thermal: Fix error check in lvts_debugfs_init()
  tools/thermal: Remove unused 'mds' and 'nrhandler' variables
  dt-bindings: thermal: fsl,scu-thermal: Document imx8dl
  thermal/drivers/imx8mm_thermal: Fix function pointer declaration by adding identifier name
  dt-bindings: thermal: nvidia,tegra124-soctherm: Add missing unevaluatedProperties on child node schemas
  thermal/drivers/mediatek/lvts_thermal: Add mt7988 support
  thermal/drivers/mediatek/lvts_thermal: Make coeff configurable
  dt-bindings: thermal: mediatek: Add LVTS thermal sensors for mt7988
  dt-bindings: thermal: mediatek: Add mt7988 lvts compatible
2023-10-25 14:41:10 +02:00
Rafael J. Wysocki
8aa49284f3 Merge branch 'thermal-intel'
Merge changes in Intel thermal control drivers for 6.7-rc1:

 - Add power floor notifications support to the int340x thermal control
   driver (Srinivas Pandruvada).

 - Rework updating trip points in the int340x thermal driver so that it
   does not access thermal zone internals directly (Rafael Wysocki).

 - Use param_get_byte() instead of param_get_int() as the max_idle module
   parameter .get() callback in the Intel powerclamp thermal driver to
   avoid possible out-of-bounds access (David Arcari).

 - Add workload hints support to the the int340x thermal driver (Srinivas
   Pandruvada).

* thermal-intel:
  selftests/thermel/intel: Add test to read power floor status
  thermal: int340x: processor_thermal: Enable power floor support
  thermal: int340x: processor_thermal: Handle power floor interrupts
  thermal: int340x: processor_thermal: Support power floor notifications
  thermal: int340x: processor_thermal: Set feature mask before proc_thermal_add
  thermal: int340x: processor_thermal: Common function to clear SOC interrupt
  thermal: int340x: processor_thermal: Move interrupt status MMIO offset to common header
  thermal: intel: powerclamp: fix mismatch in get function for max_idle
  thermal: int340x: Use thermal_zone_for_each_trip()
  thermal: int340x: processor_thermal: Ack all PCI interrupts
  thermal: int340x: Add ArrowLake-S PCI ID
  selftests/thermel/intel: Add test to read workload hint
  thermal: int340x: Handle workload hint interrupts
  thermal: int340x: processor_thermal: Add workload type hint interface
  thermal: int340x: Remove PROC_THERMAL_FEATURE_WLT_REQ for Meteor Lake
  thermal: int340x: processor_thermal: Use non MSI interrupts by default
  thermal: int340x: processor_thermal: Add interrupt configuration function
  thermal: int340x: processor_thermal: Move mailbox code to common module
2023-10-23 20:06:58 +02:00
Rafael J. Wysocki
598c20f964 Merge branch 'thermal-core'
Merge thermal core changes for 6.7-rc1:

 - Use trip pointers in thermal governors and in the related part of
   the thermal core (Rafael Wysocki).

 - Avoid updating trip points when the thermal zone temperature falls
   into a trip point's hysteresis range (ícolas F. R. A. Prado).

* thermal-core:
  thermal: ACPI: Include the right header file
  thermal: core: Don't update trip points inside the hysteresis range
  thermal: core: Pass trip pointer to governor throttle callback
  thermal: gov_step_wise: Fold update_passive_instance() into its caller
  thermal: gov_power_allocator: Use trip pointers instead of trip indices
  thermal: gov_fair_share: Rearrange get_trip_level()
  thermal: trip: Define for_each_trip() macro
  thermal: trip: Simplify computing trip indices
2023-10-23 20:00:51 +02:00
Rafael J. Wysocki
27fa2b6043 Merge branch 'acpi-thermal'
Merge ACPI thermal driver changes are related thermal core changes for
v6.7-rc1:

 - Untangle the initialization and updates of passive and active trip
   points in the ACPI thermal driver (Rafael Wysocki).

 - Reduce code duplication related to the initialization and updates
   of trip points in the ACPI thermal driver (Rafael Wysocki).

 - Use trip pointers for cooling device binding in the ACPI thermal
   driver (Rafael Wysocki).

 - Simplify critical and hot trips representation in the ACPI thermal
   driver (Rafael Wysocki).

* acpi-thermal: (26 commits)
  thermal: trip: Drop lockdep assertion from thermal_zone_trip_id()
  thermal: trip: Remove lockdep assertion from for_each_thermal_trip()
  thermal: core: Drop thermal_zone_device_exec()
  ACPI: thermal: Use thermal_zone_for_each_trip() for updating trips
  ACPI: thermal: Combine passive and active trip update functions
  ACPI: thermal: Move get_active_temp()
  ACPI: thermal: Fix up function header formatting in two places
  ACPI: thermal: Drop list of device ACPI handles from struct acpi_thermal
  ACPI: thermal: Rename structure fields holding temperature in deci-Kelvin
  ACPI: thermal: Drop critical_valid and hot_valid trip flags
  ACPI: thermal: Do not use trip indices for cooling device binding
  ACPI: thermal: Mark uninitialized active trips as invalid
  ACPI: thermal: Merge trip initialization functions
  ACPI: thermal: Collapse trip devices update function wrappers
  ACPI: thermal: Collapse trip devices update functions
  ACPI: thermal: Add device list to struct acpi_thermal_trip
  ACPI: thermal: Fix a small leak in acpi_thermal_add()
  ACPI: thermal: Drop valid flag from struct acpi_thermal_trip
  ACPI: thermal: Drop redundant trip point flags
  ACPI: thermal: Untangle initialization and updates of active trips
  ...
2023-10-23 19:51:26 +02:00
Rafael J. Wysocki
c27d08f786 thermal: ACPI: Include the right header file
It is not necessary to include thermal_core.h into thermal_acpi.c,
because none of the code in there depends on anything in the former,
except for the linux/thermal.h, but it is better to include that one
directly instead of including the entire thermal_core.h, so make that
change.

No functional impact.

Fixes: 7a0e397488 ("thermal: ACPI: Add ACPI trip point routines")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2023-10-20 19:26:37 +02:00
Nícolas F. R. A. Prado
cf3986f8c0 thermal: core: Don't update trip points inside the hysteresis range
When searching for the trip points that need to be set, the nearest
higher trip point's temperature is used for the high trip, while the
nearest lower trip point's temperature minus the hysteresis is used for
the low trip. The issue with this logic is that when the current
temperature is inside a trip point's hysteresis range, both high and low
trips will come from the same trip point. As a consequence instability
can still occur like this:
* the temperature rises slightly and enters the hysteresis range of a
  trip point
* polling happens and updates the trip points to the hysteresis range
* the temperature falls slightly, exiting the hysteresis range, crossing
  the trip point and triggering an IRQ, the trip points are updated
* repeat

So even though the current hysteresis implementation prevents
instability from happening due to IRQs triggering on the same
temperature value, both ways, it doesn't prevent it from happening due
to an IRQ on one way and polling on the other.

To properly implement a hysteresis behavior, when inside the hysteresis
range, don't update the trip points. This way, the previously set trip
points will stay in effect, which will in a way remember the previous
state (if the temperature signal came from above or below the range) and
therefore have the right trip point already set.

The exception is if there was no previous trip point set, in which case
a previous state doesn't exist, and so it's sensible to allow the
hysteresis range as trip points.

The following logs show the current behavior when running on a real
machine:

[  202.524658] thermal thermal_zone0: new temperature boundaries: -2147483647 < x < 40000
   203.562817: thermal_temperature: thermal_zone=vpu0-thermal id=0 temp_prev=36986 temp=37979
[  203.562845] thermal thermal_zone0: new temperature boundaries: 37000 < x < 40000
   204.176059: thermal_temperature: thermal_zone=vpu0-thermal id=0 temp_prev=37979 temp=40028
[  204.176089] thermal thermal_zone0: new temperature boundaries: 37000 < x < 100000
   205.226813: thermal_temperature: thermal_zone=vpu0-thermal id=0 temp_prev=40028 temp=38652
[  205.226842] thermal thermal_zone0: new temperature boundaries: 37000 < x < 40000

And with this patch applied:

[  184.933415] thermal thermal_zone0: new temperature boundaries: -2147483647 < x < 40000
   185.981182: thermal_temperature: thermal_zone=vpu0-thermal id=0 temp_prev=36986 temp=37872
   186.744685: thermal_temperature: thermal_zone=vpu0-thermal id=0 temp_prev=37872 temp=40058
[  186.744716] thermal thermal_zone0: new temperature boundaries: 37000 < x < 100000
   187.773284: thermal_temperature: thermal_zone=vpu0-thermal id=0 temp_prev=40058 temp=38698

Fixes: 060c034a97 ("thermal: Add support for hardware-tracked trip points")
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Co-developed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-10-20 19:26:37 +02:00
Rafael J. Wysocki
8c35b1f472 thermal: core: Pass trip pointer to governor throttle callback
Modify the governor .throttle() callback definition so that it takes a
trip pointer instead of a trip index as its second argument, adjust the
governors accordingly and update the core code invoking .throttle().

This causes the governors to become independent of the representation
of the list of trips in the thermal zone structure.

This change is not expected to alter the general functionality.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2023-10-20 19:26:37 +02:00
Rafael J. Wysocki
fdcf70ed4e thermal: gov_step_wise: Fold update_passive_instance() into its caller
Fold update_passive_instance() into thermal_zone_trip_update() that is
its only caller so as to make the code in question easier to follow.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2023-10-20 19:26:37 +02:00
Rafael J. Wysocki
94be1d27aa thermal: gov_power_allocator: Use trip pointers instead of trip indices
Modify the power allocator thermal governor to use trip pointers instead
of trip indices everywhere except for the power_allocator_throttle()
second argument that will be changed subsequently along with the
definition of the .throttle() governor callback.

The general functionality is not expected to be changed.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Tested-by: Lukasz Luba <lukasz.luba@arm.com>
2023-10-20 19:26:37 +02:00
Rafael J. Wysocki
276f1ede95 thermal: gov_fair_share: Rearrange get_trip_level()
Make get_trip_level() use for_each_trip() to iterate over trip points
and make it call thermal_zone_trip_id() to obtain the integer ID of a
given trip point so as to avoid relying on the knowledge of struct
thermal_zone_device internals.

The general functionality is not expected to be changed.

This change causes the governor to use trip pointers instead of trip
indices everywhere except for the fair_share_throttle() second argument
that will be modified subsequently along with the definition of the
governor .throttle() callback.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2023-10-20 19:26:37 +02:00
Rafael J. Wysocki
234ed6f5fb thermal: trip: Define for_each_trip() macro
Define a new macro for_each_trip() to be used by the thermal core code
and thermal governors for walking trips in a given thermal zone.

Modify for_each_thermal_trip() to use this macro instead of an open-
coded loop over trips.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2023-10-20 19:26:01 +02:00
Rafael J. Wysocki
78869767f2 thermal: trip: Simplify computing trip indices
A trip index can be computed right away as a difference between the
value of a trip pointer pointing to the given trip object and the
start of the trips[] table in the given thermal zone, so change
thermal_zone_trip_id() accordingly.

No intentional functional impact (except for some speedup).

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Tested-by: Lukasz Luba <lukasz.luba@arm.com>
2023-10-20 19:25:22 +02:00
Dmitry Baryshkov
9618efe343 thermal/qcom/tsens: Drop ops_v0_1
Since the commit 6812d1dfbc ("thermal/drivers/qcom/tsens-v0_1: Fix
mdm9607 slope values") the default v0.1 implementation of tsens
options is unused by the driver. Drop it now to stop compiler
complaining about the unused static const. If it appears there is the
need for the default v0.1 ops struct, this commit can be easily
reverted without further considerations.

Fixes: 6812d1dfbc ("thermal/drivers/qcom/tsens-v0_1: Fix mdm9607 slope values")
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20231019144311.1035181-1-dmitry.baryshkov@linaro.org
2023-10-19 17:13:12 +02:00
Balsam CHIHI
5437d14d94 thermal/drivers/mediatek/lvts_thermal: Update calibration data documentation
Update LVTS calibration data documentation for mt8192 and mt8195.

Signed-off-by: Balsam CHIHI <bchihi@baylibre.com>
Reviewed-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
[bero@baylibre.com: Fix issues pointed out by Nícolas F. R. A. Prado <nfraprado@collabora.com>]
Signed-off-by: Bernhard Rosenkränzer <bero@baylibre.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Reviewed-by: Alexandre Mergnat <amergnat@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20231017190545.157282-6-bero@baylibre.com
2023-10-19 02:43:32 +02:00
Balsam CHIHI
288732242d thermal/drivers/mediatek/lvts_thermal: Add mt8192 support
Add LVTS Driver support for MT8192.

Co-developed-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Signed-off-by: Balsam CHIHI <bchihi@baylibre.com>
Reviewed-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
[bero@baylibre.com: cosmetic changes, rebase]
Signed-off-by: Bernhard Rosenkränzer <bero@baylibre.com>
Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com>
Reviewed-by: Alexandre Mergnat <amergnat@baylibre.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20231017190545.157282-4-bero@baylibre.com
2023-10-19 02:43:06 +02:00
Balsam CHIHI
8137bb9060 thermal/drivers/mediatek/lvts_thermal: Add suspend and resume
Add suspend and resume support to LVTS driver.

Signed-off-by: Balsam CHIHI <bchihi@baylibre.com>
[bero@baylibre.com: suspend/resume in noirq phase]
Co-developed-by: Bernhard Rosenkränzer <bero@baylibre.com>
Signed-off-by: Bernhard Rosenkränzer <bero@baylibre.com>
Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com>
Reviewed-by: Alexandre Mergnat <amergnat@baylibre.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20231017190545.157282-3-bero@baylibre.com
2023-10-19 02:41:17 +02:00
Markus Schneider-Pargmann
5055fadfa7 thermal/drivers/mediatek: Fix probe for THERMAL_V2
Fix the probe function to call mtk_thermal_release_periodic_ts for
everything != MTK_THERMAL_V1. This was accidentally changed from V1
to V2 in the original patch.

Reported-by: Frank Wunderlich <frank-w@public-files.de>
Closes: https://lore.kernel.org/lkml/B0B3775B-B8D1-4284-814F-4F41EC22F532@public-files.de/
Reported-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Closes: https://lore.kernel.org/lkml/07a569b9-e691-64ea-dd65-3b49842af33d@linaro.org/
Fixes: 33140e668b ("thermal/drivers/mediatek: Control buffer enablement tweaks")
Signed-off-by: Markus Schneider-Pargmann <msp@baylibre.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20230918100706.1229239-1-msp@baylibre.com
2023-10-17 11:16:01 +02:00
Thierry Reding
5368084c39 thermal/drivers/max77620: Remove duplicate error message
The thermal_of_zone_register() function already prints an error message
when appropriate, so remove the extra one from the MAX77620 thermal
driver.

This fixes a spurious error message when no thermal zone was defined
for the MAX77620 in device tree.

Reported-by: Nicolas Chauvet <kwizart@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20231013155104.1781197-1-thierry.reding@gmail.com
2023-10-15 23:40:10 +02:00
Minjie Du
2ffa39c83b thermal/drivers/mediatek/lvts_thermal: Fix error check in lvts_debugfs_init()
debugfs_create_dir() function returns an error value embedded in
the pointer (PTR_ERR). Evaluate the return value using IS_ERR
rather than checking for NULL.

Signed-off-by: Minjie Du <duminjie@vivo.com>
Reviewed-by: Alexandre Mergnat <amergnat@baylibre.com>
Reviewed-by: Chen-Yu Tsai <wenst@chromium.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20230921091057.3812-1-duminjie@vivo.com
2023-10-15 23:40:10 +02:00
Bragatheswaran Manickavel
f84f6e0f45 thermal/drivers/imx8mm_thermal: Fix function pointer declaration by adding identifier name
Added identifier names to respective definitions for fix
warnings reported by checkpatch.pl

WARNING: function definition argument 'void *' should also have an identifier name
WARNING: function definition argument 'int *' should also have an identifier name
Signed-off-by: Bragatheswaran Manickavel <bragathemanick0908@gmail.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20230917083443.3220-1-bragathemanick0908@gmail.com
2023-10-15 23:40:09 +02:00
Frank Wunderlich
585e92e6a7 thermal/drivers/mediatek/lvts_thermal: Add mt7988 support
Add Support for Mediatek Filogic 880/MT7988 LVTS.

Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
Tested-by: Daniel Golle <daniel@makrotopia.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20230922055020.6436-5-linux@fw-web.de
2023-10-15 23:40:09 +02:00
Frank Wunderlich
6725a29321 thermal/drivers/mediatek/lvts_thermal: Make coeff configurable
The upcoming mt7988 has different temperature coefficients so we
cannot use constants in the functions lvts_golden_temp_init,
lvts_golden_temp_init and lvts_raw_to_temp anymore.

Add a field in the lvts_ctrl pointing to the lvts_data which now
contains the soc-specific temperature coefficents.

To make the code better readable, rename static int coeff_b to
golden_temp_offset, COEFF_A to temp_factor and COEFF_B to temp_offset.

Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Tested-by: Daniel Golle <daniel@makrotopia.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20230922055020.6436-4-linux@fw-web.de
2023-10-15 23:40:09 +02:00
Srinivas Pandruvada
0e50925392 thermal: int340x: processor_thermal: Enable power floor support
Enable power floor feature support for Meteor Lake processors.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-10-12 21:09:48 +02:00
Srinivas Pandruvada
8cd5ad18dd thermal: int340x: processor_thermal: Handle power floor interrupts
On thermal device interrupt, if the interrupt is generated for passing
power floor status, call the callback to pass notification to the user
space.

First call proc_thermal_check_power_floor_intr() to check interrupt, if
this callback returns true, wake the IRQ thread to call
proc_thermal_power_floor_intr_callback() to notify user space.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-10-12 21:09:48 +02:00
Srinivas Pandruvada
b473d6a9d6 thermal: int340x: processor_thermal: Support power floor notifications
When the hardware reduces the power to the minimum possible, the power
floor is notified via an interrupt.

This can happen when user space requests a power limit via powercap RAPL
interface, which forces the system to enter to the lowest power. This
power floor indication can be used as a hint to resort to other methods
of reducing power than via RAPL power limit.

Before power floor status can be read or the firmware can trigger
notifications regarding it, it needs to be configured via a mailbox
command. The actual power floor status is read via bit 39 of MMIO
offset 0x5B18 of the processor thermal PCI device.

To show the current power floor status and get notification
on a sysfs attribute, add 2 new attributes to
/sys/bus/pci/devices/0000\:00\:04.0/power_limits/

power_floor_enable : This attribute is present when power floor
 notifications are supported. This attribute allows to enable/disable
 power floor notifications.

power_floor_status : This attribute is present when power floor
 notifications are supported. When enabled via power_floor_enable, this
 attribute shows the current power floor status.

The power floor implementation provides interfaces which are called
from the sysfs callbacks to enable/disable and read power floor
status. It also provides two additional interfaces to check if the
current processor thermal device interrupt is for power floor status
and to send notifications to user space.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
[ rjw: Changelog and documentation changes edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-10-12 21:09:48 +02:00
Srinivas Pandruvada
6ebc25d8b0 thermal: int340x: processor_thermal: Set feature mask before proc_thermal_add
The function proc_thermal_add() adds sysfs entries for power limits.

The feature mask of available features is not present at that time, so
it cannot be used by proc_thermal_add() to selectively create sysfs
attributes.

The feature mask is set by proc_thermal_mmio_add(), so modify the code
to call it before proc_thermal_add() so as to allow the latter to use
the feature mask.

There is no functional impact with this change.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-10-12 21:09:48 +02:00
Srinivas Pandruvada
088f16f352 thermal: int340x: processor_thermal: Common function to clear SOC interrupt
The SOC interrupt status register contains multiple interrupt sources
(workload hint interrupt and power floor interrupt). It is not possible
to clear individual interrupt source with read-modify-write, as it may
clear the new interrupt from the firmware after a read operation. It is
also not possible to set the interrupt status bit to 1 for the other
interrupt source, which is not part of clearing.

Hence, create a common function, to clear all status bits at once.

Call this function after processing all interrupt sources.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-10-12 21:09:48 +02:00
Srinivas Pandruvada
24e4c26202 thermal: int340x: processor_thermal: Move interrupt status MMIO offset to common header
Move define SOC_WT_RES_INT_STATUS_OFFSET to processor_thermal_device.h.

This way it can be reused in other modules.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-10-12 21:09:48 +02:00
Rafael J. Wysocki
d3bff62770 Merge branch 'thermal-misc'
Merge thermal control changes related to switching over platform drivers
to using void remove callbacks.

* thermal-misc: (31 commits)
  thermal: amlogic: Convert to platform remove callback returning void
  thermal: uniphier: Convert to platform remove callback returning void
  thermal: ti-bandgap: Convert to platform remove callback returning void
  thermal: tegra-bpmp: Convert to platform remove callback returning void
  thermal: soctherm: Convert to platform remove callback returning void
  thermal: stm: Convert to platform remove callback returning void
  thermal: sprd: Convert to platform remove callback returning void
  thermal: spear: Convert to platform remove callback returning void
  thermal: exynos_tmu: Convert to platform remove callback returning void
  thermal: rzg2l: Convert to platform remove callback returning void
  thermal: rockchip: Convert to platform remove callback returning void
  thermal: rcar: Convert to platform remove callback returning void
  thermal: rcar_gen3: Convert to platform remove callback returning void
  thermal: tsens: Convert to platform remove callback returning void
  thermal: lvts: Convert to platform remove callback returning void
  thermal: kirkwood: Convert to platform remove callback returning void
  thermal: k3_j72xx_bandgap: Convert to platform remove callback returning void
  thermal: k3_bandgap: Convert to platform remove callback returning void
  thermal: int3406: Convert to platform remove callback returning void
  thermal: int3403: Convert to platform remove callback returning void
  ...
2023-10-12 20:08:28 +02:00
Rafael J. Wysocki
a26b452e83 Merge branch 'acpi-thermal'
The ACPI thermal driver changes include some thermal core modifications
that are depended on by subsequent thermal core changes, so merge them.

* acpi-thermal: (26 commits)
  thermal: trip: Drop lockdep assertion from thermal_zone_trip_id()
  thermal: trip: Remove lockdep assertion from for_each_thermal_trip()
  thermal: core: Drop thermal_zone_device_exec()
  ACPI: thermal: Use thermal_zone_for_each_trip() for updating trips
  ACPI: thermal: Combine passive and active trip update functions
  ACPI: thermal: Move get_active_temp()
  ACPI: thermal: Fix up function header formatting in two places
  ACPI: thermal: Drop list of device ACPI handles from struct acpi_thermal
  ACPI: thermal: Rename structure fields holding temperature in deci-Kelvin
  ACPI: thermal: Drop critical_valid and hot_valid trip flags
  ACPI: thermal: Do not use trip indices for cooling device binding
  ACPI: thermal: Mark uninitialized active trips as invalid
  ACPI: thermal: Merge trip initialization functions
  ACPI: thermal: Collapse trip devices update function wrappers
  ACPI: thermal: Collapse trip devices update functions
  ACPI: thermal: Add device list to struct acpi_thermal_trip
  ACPI: thermal: Fix a small leak in acpi_thermal_add()
  ACPI: thermal: Drop valid flag from struct acpi_thermal_trip
  ACPI: thermal: Drop redundant trip point flags
  ACPI: thermal: Untangle initialization and updates of active trips
  ...
2023-10-11 17:56:51 +02:00
Rafael J. Wysocki
108ffd12be thermal: trip: Drop lockdep assertion from thermal_zone_trip_id()
The lockdep assertion in thermal_zone_trip_id() triggers when the
trip point sysfs attribute of a thermal instance is read, because
there is no thermal zone locking in that code path.

This is not verly useful, though, because there is no mechanism by which
the location of the trips[] table in a thermal zone or its size can
change after binding cooling devices to the trips in that thermal
zone and before those cooling devices are unbound from them.  Thus
it is not in fact necessary to hold the thermal zone lock when
thermal_zone_trip_id() is called from trip_point_show() and so the
lockdep asserion in the former is invalid.

Accordingly, drop that lockdep assertion.

Fixes: 2c7b4bfade ("thermal: core: Store trip pointer in struct thermal_instance")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-10-11 17:45:42 +02:00
Dan Carpenter
c99626092e thermal: core: prevent potential string overflow
The dev->id value comes from ida_alloc() so it's a number between zero
and INT_MAX.  If it's too high then these sprintf()s will overflow.

Fixes: 203d3d4aa4 ("the generic thermal sysfs driver")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-10-11 16:15:09 +02:00
Kees Cook
86748637bf drivers: thermal: tsens: Annotate struct tsens_priv with __counted_by
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS
(for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).

As found with Coccinelle[1], add __counted_by for struct tsens_priv.

[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci

Cc: Andy Gross <agross@kernel.org>
Cc: Bjorn Andersson <andersson@kernel.org>
Cc: Konrad Dybcio <konrad.dybcio@linaro.org>
Cc: Amit Kucheria <amitk@kernel.org>
Cc: Thara Gopinath <thara.gopinath@gmail.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-arm-msm@vger.kernel.org
Cc: linux-pm@vger.kernel.org
Reviewed-by: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20230922175341.work.919-kees@kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2023-10-08 22:46:05 -07:00
Rafael J. Wysocki
b44444027c thermal: trip: Remove lockdep assertion from for_each_thermal_trip()
The lockdep assertion in for_each_thermal_trip() was added to possibly
catch incorrect usage of that function without the thermal zone lock.
However, it turns out that the ACPI thermal driver has a legitimate
reason to call for_each_thermal_trip() without locking.

Namely, it is called by acpi_thermal_bind_unbind_cdev() in the thermal
zone registration and unregistration paths.  That function cannot acquire
the thermal zone lock by itself, because it calls functions that acquire
it, thermal_bind_cdev_to_trip() or thermal_unbind_cdev_from_trip().
However, it is invoked when the ACPI notify handler for the thermal
zone in question has not been registered yet (in the registration path)
or after that handler has been unregistered (in the unregistration
path).  Therefore, when for_each_thermal_trip() is called by
acpi_thermal_bind_unbind_cdev(), thermal trip changes induced by the
platform firmware cannot take place and so the thermal zone's trips[]
table is effectively immutable.  Hence, it is valid to call
for_each_thermal_trip() from acpi_thermal_bind_unbind_cdev() without
locking and the lockdep assertion in the former is in fact incorrect, so
remove it.

Fixes: d5ea889246 ("ACPI: thermal: Do not use trip indices for cooling device binding")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-10-07 13:29:22 +02:00
David Arcari
fae633cfb7 thermal: intel: powerclamp: fix mismatch in get function for max_idle
KASAN reported this

      [ 444.853098] BUG: KASAN: global-out-of-bounds in param_get_int+0x77/0x90
      [ 444.853111] Read of size 4 at addr ffffffffc16c9220 by task cat/2105
      ...
      [ 444.853442] The buggy address belongs to the variable:
      [ 444.853443] max_idle+0x0/0xffffffffffffcde0 [intel_powerclamp]

There is a mismatch between the param_get_int and the definition of
max_idle.  Replacing param_get_int with param_get_byte resolves this
issue.

Fixes: ebf5197102 ("thermal: intel: powerclamp: Add two module parameters")
Cc: 6.3+ <stable@vger.kernel.org> # 6.3+
Signed-off-by: David Arcari <darcari@redhat.com>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-10-05 20:18:03 +02:00
Rafael J. Wysocki
4963e34ce7 thermal: core: Drop thermal_zone_device_exec()
Because thermal_zone_device_exec() has no users any more and there are
no plans to use it anywhere, revert commit 9a99a996d1 ("thermal: core:
Introduce thermal_zone_device_exec()") that introduced it.

No functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2023-10-05 13:32:55 +02:00
Rafael J. Wysocki
cd3c00e776 thermal: int340x: Use thermal_zone_for_each_trip()
Modify int340x_thermal_update_trips() to use thermal_zone_for_each_trip()
for walking trips instead of using the trips[] table passed to the
thermal zone registration function.

For this purpose, store active trip point indices in the priv fieids of
the corresponding thermal_trip structures.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2023-10-05 13:31:15 +02:00
Rafael J. Wysocki
2cbe1a3331 Merge earlier changes in Intel thermal drivers for v6.7. 2023-10-05 13:24:25 +02:00
Rafael J. Wysocki
a56cc0a833 thermal: core: Add function to walk trips under zone lock
Add a wrapper around for_each_thermal_trip(), called
thermal_zone_for_each_trip(), that will invoke the former under the
thermal zone lock and pass its return value to the caller.

Two drivers will be modified subsequently to use this new function.

No functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2023-10-05 13:13:40 +02:00
Uwe Kleine-König
eea6c26207 thermal: amlogic: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

amlogic_thermal_disable() always returned zero. Change it to return no
value and then trivially convert the driver to .remove_new() and fix a
whitespace inconsitency en passant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-10-02 14:24:19 +02:00
Uwe Kleine-König
439f0bb3a7 thermal: uniphier: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-10-02 14:24:17 +02:00
Uwe Kleine-König
f3e38da002 thermal: ti-bandgap: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-10-02 14:24:15 +02:00
Uwe Kleine-König
f021f05262 thermal: tegra-bpmp: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-10-02 14:24:13 +02:00
Uwe Kleine-König
f1afede9e2 thermal: soctherm: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-10-02 14:24:11 +02:00
Uwe Kleine-König
ca92bdec59 thermal: stm: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-10-02 14:24:09 +02:00
Uwe Kleine-König
295b117645 thermal: sprd: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-10-02 14:24:08 +02:00
Uwe Kleine-König
7c2714a1e6 thermal: spear: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-10-02 14:24:06 +02:00
Uwe Kleine-König
0b478d7b86 thermal: exynos_tmu: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-10-02 14:24:02 +02:00
Uwe Kleine-König
24bbbfb73e thermal: rzg2l: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-10-02 14:23:51 +02:00
Uwe Kleine-König
cc86ac43e5 thermal: rockchip: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-10-02 14:23:30 +02:00
Uwe Kleine-König
03a5a75a6a thermal: rcar: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-09-29 12:34:17 +02:00
Uwe Kleine-König
a07f4487bc thermal: rcar_gen3: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-09-29 12:34:17 +02:00
Uwe Kleine-König
2128ba4639 thermal: tsens: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-09-29 12:34:17 +02:00
Uwe Kleine-König
2d3c823df2 thermal: lvts: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-09-29 12:34:17 +02:00
Uwe Kleine-König
3ebaf0f244 thermal: kirkwood: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-09-29 12:34:17 +02:00
Uwe Kleine-König
3c9e0f218c thermal: k3_j72xx_bandgap: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-09-29 12:34:17 +02:00
Uwe Kleine-König
ff96e615ea thermal: k3_bandgap: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-09-29 12:34:17 +02:00
Uwe Kleine-König
a876f99d12 thermal: int3406: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-09-29 12:34:17 +02:00
Uwe Kleine-König
2570108311 thermal: int3403: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-09-29 12:34:17 +02:00
Uwe Kleine-König
f287083ba7 thermal: int3402: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-09-29 12:34:16 +02:00
Uwe Kleine-König
8cc099170f thermal: int3401: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-09-29 12:34:16 +02:00
Uwe Kleine-König
75b66d7e59 thermal: int3400: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-09-29 12:34:16 +02:00
Uwe Kleine-König
02e7baaf51 thermal: imx: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-09-29 12:34:16 +02:00
Uwe Kleine-König
5568f642bb thermal: imx8mm: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-09-29 12:34:16 +02:00
Uwe Kleine-König
6abe2f00d2 thermal: hisi: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-09-29 12:34:16 +02:00
Uwe Kleine-König
7aa234e99f thermal: dove: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-09-29 12:34:16 +02:00
Uwe Kleine-König
0d0f8b2c4d thermal: da9062: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-09-29 12:34:16 +02:00
Uwe Kleine-König
aa599650d2 thermal: ns: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-09-29 12:34:16 +02:00
Uwe Kleine-König
f29ecd3748 thermal: bcm2835: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-09-29 12:34:16 +02:00
Uwe Kleine-König
4347e7d0b4 thermal: armada: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-09-29 12:34:16 +02:00
Rafael J. Wysocki
d069ed6b75 thermal: core: Allow trip pointers to be used for cooling device binding
Add new helper functions, thermal_bind_cdev_to_trip() and
thermal_unbind_cdev_from_trip(), to allow a trip pointer to be used for
binding a cooling device to a trip point and unbinding it, respectively,
and redefine the existing helpers, thermal_zone_bind_cooling_device()
and thermal_zone_unbind_cooling_device(), as wrappers around the new
ones, respectively.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2023-09-28 12:57:10 +02:00
Rafael J. Wysocki
2c7b4bfade thermal: core: Store trip pointer in struct thermal_instance
Replace the integer trip number stored in struct thermal_instance with
a pointer to the relevant trip and adjust the code using the structure
in question accordingly.

The main reason for making this change is to allow the trip point to
cooling device binding code more straightforward, as illustrated by
subsequent modifications of the ACPI thermal driver, but it also helps
to clarify the overall design and allows the governor code overhead to
be reduced (through subsequent modifications).

The only case in which it adds complexity is trip_point_show() that
needs to walk the trips[] table to find the index of the given trip
point, but this is not a critical path and the interface that
trip_point_show() belongs to is problematic anyway (for instance, it
doesn't cover the case when the same cooling devices is associated
with multiple trip points).

This is a preliminary change and the affected code will be refined by
a series of subsequent modifications of thermal governors, the core and
the ACPI thermal driver.

The general functionality is not expected to be affected by this change.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2023-09-28 12:55:29 +02:00
Rafael J. Wysocki
a15ffa783e thermal: trip: Drop redundant trips check from for_each_thermal_trip()
It is invalid to call for_each_thermal_trip() on an unregistered thermal
zone anyway, and as per thermal_zone_device_register_with_trips(), the
trips[] table must be present if num_trips is greater than zero for the
given thermal zone.

Hence, the trips check in for_each_thermal_trip() is redundant and so it
can be dropped.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2023-09-26 17:59:53 +02:00
Rafael J. Wysocki
9502108876 thermal: core: Drop trips_disabled bitmask
After recent changes, thermal_zone_get_trip() cannot fail, as invoked
from thermal_zone_device_register_with_trips(), so the only role of
the trips_disabled bitmask is struct thermal_zone_device is to make
handle_thermal_trip() skip trip points whose temperature was initially
zero.  However, since the unit of temperature in the thermal core is
millicelsius, zero may very well be a valid temperature value at least
in some usage scenarios and the trip temperature may as well change
later.  Thus there is no reason to permanently disable trip points
with initial temperature equal to zero.

Accordingly, drop the trips_disabled bitmask along with the code
related to it.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2023-09-25 11:46:19 +02:00
Srinivas Pandruvada
1ced5dce63 thermal: int340x: processor_thermal: Ack all PCI interrupts
All interrupts from the processor thermal PCI device require ACK. This
is done by writing 0x01 at offset 0xDC in the config space.

This is already done for the thereshold interrupt. Extend this for the
workload hint interrupt.

Fixes: e682b86211 ("thermal: int340x: Handle workload hint interrupts")
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-09-21 20:30:03 +02:00
Srinivas Pandruvada
a966a0da3b thermal: int340x: Add ArrowLake-S PCI ID
Add ArrowLake-S PCI ID for processor thermal device.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-09-18 13:14:59 +02:00
Rafael J. Wysocki
ea3105672c thermal: sysfs: Fix trip_point_hyst_store()
After commit 2e38a2a981 ("thermal/core: Add a generic thermal_zone_set_trip()
function") updating a trip point temperature doesn't actually work,
because the value supplied by user space is subsequently overwritten
with the current trip point hysteresis value.

Fix this by changing the code to parse the number string supplied by
user space after retrieving the current trip point data from the
thermal zone.

Also drop a redundant tab character from the code in question.

Fixes: 2e38a2a981 ("thermal/core: Add a generic thermal_zone_set_trip() function")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: 6.3+ <stable@vger.kernel.org> # 6.3+
2023-09-18 13:13:05 +02:00
Srinivas Pandruvada
e682b86211 thermal: int340x: Handle workload hint interrupts
On thermal device interrupt, if the interrupt is generated for passing
workload hint, call the callback to pass notification to the user
space.

First call proc_thermal_check_wt_intr() to check interrupt, if this
callback returns true, wake the IRQ thread to call
proc_thermal_wt_intr_callback() to notify user space.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
[ rjw: Subject adjustment, changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-09-14 21:52:19 +02:00
Srinivas Pandruvada
4b029a81c2 thermal: int340x: processor_thermal: Add workload type hint interface
Prior to Meteor Lake processor generation, user space can pass workload
type request to the firmware. Then firmware can optimize power based on
the indicated workload type. User space also uses workload type requests
to implement its own heuristics.

The firmware in Meteor Lake processor generation is capable of predicting
workload type without software help.

To avoid duplicate processing, add a sysfs interface allowing user space
to obtain the workload hint from the firmware instead of trying to
predict the workload type by itself.

This workload hint is passed from the firmware via MMIO offset 0x5B18 of
the processor thermal PCI device. Before workload hints can be produced by
the firmware, it needs to be configured via a mailbox command.  This
mailbox command turns ON the workload hint and it allows to program a
notification delay to control the rate of notifications.

The notification delay can be changed from user space vis sysfs.

Attribute group 'workload_hint' in sysfs is used for implementing the
workload hints interface between user space and the kernel.

It contains the following attributes:

workload_type_enable:
	Enables/disables workload type hints from the firmware.

notification_delay_ms:
	Notification delay in milliseconds.

workload_type_index:
	The current workload type index predicted by the firmware (see
	the documentation changes below for supported index values and
	their meaning).

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
[ rjw: Changelog edits, documentation edits, whitespace adjustments ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-09-14 21:52:19 +02:00
Srinivas Pandruvada
2f0b31c026 thermal: int340x: Remove PROC_THERMAL_FEATURE_WLT_REQ for Meteor Lake
Meteor Lake processor supports firmware hints for predicting workload
type. So, remove support for passing workload hints to the firmware.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
[ rjw: Subject adjustment ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-09-14 21:52:19 +02:00
Srinivas Pandruvada
f0658708e8 thermal: int340x: processor_thermal: Use non MSI interrupts by default
There are issues in using MSI interrupts for processor thermal device.
The support is not consistent across generations. But the legacy PCI
interrupts work on all current generations.

Hence always use legacy PCI interrupts by default, instead of MSI.
Add a module param to use of MSI, so that MSI can be still used.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-09-14 21:52:19 +02:00
Srinivas Pandruvada
dd28a3cb92 thermal: int340x: processor_thermal: Add interrupt configuration function
Some features like workload type prediction and power floor events
require interrupt support to avoid polling. Here interrupts are enabled
and disabled via sending mailbox commands. The mailbox command ID is
0x1E for read and 0x1F for write.

The interrupt configuration will require mutex protection as it involves
read-modify-write operation. Since mutex are already used in the mailbox
read/write functions: send_mbox_write_cmd() and send_mbox_read_cmd(),
there will be double locking. But, this can be avoided by moving mutexes
from mailbox read/write processing functions to the callers:
processor_thermal_send_mbox_[read|write]_cmd().

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
[ rjw: Adjust subject, fix up computation ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-09-14 21:52:19 +02:00
Srinivas Pandruvada
b894685cb8 thermal: int340x: processor_thermal: Move mailbox code to common module
The processor thermal mailbox is used for workload type request and
also in the processor thermal RFIM module. So, move the workload type
request code to its own module from the current processor thermal
mailbox module.

processor_thermal_mailbox.c contains only mailbox read/write related
source code. The source related to workload_types requests is moved to
a module processor_thermal_wt_req.c.

In addition
 - Rename PROC_THERMAL_FEATURE_MBOX to PROC_THERMAL_FEATURE_WT_REQ.
 - proc_thermal_mbox_add(), which adds workload type sysfs attribute group
   is renamed to proc_thermal_wt_req_add().
 - proc_thermal_mbox_remove() is renamed to proc_thermal_wt_req_remove().

While here, resolve check patch warnings for 100 columns for only modified
lines.

No functional changes are expected.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-09-14 21:52:19 +02:00