
cxl/port: Fix cxl_bus_rescan() vs bus_rescan_devices()

It turns out that since its original introduction, pre-2.6.12,
bus_rescan_devices() has skipped devices that might be in the process of
attaching to or detaching from their driver. For CXL this behavior is
unwanted: CXL expects cxl_bus_rescan() to act as a probe barrier.

That behavior is simple enough to achieve with bus_for_each_dev() paired
with a call to device_attach(). It is unclear why bus_rescan_devices()
opted for a lockless, and therefore racy, read of dev->driver.
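
The difference between the two rescan styles can be sketched in a small
userspace model. This is not kernel code: every model_* name below is
hypothetical, the device lock is reduced to a flag, and the sketch assumes
the skip happens because dev->driver is still set while an unbind is in
flight.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: these types mimic, but are not, the kernel's. */
struct model_driver { const char *name; };

struct model_device {
	const struct model_driver *driver; /* stands in for dev->driver */
	bool unbind_in_flight;             /* an unbind holds the device lock */
};

const struct model_driver model_drv = { "model_drv" };

/* The in-flight unbind finishes: the driver pointer is cleared last. */
void model_complete_unbind(struct model_device *dev)
{
	if (dev->unbind_in_flight) {
		dev->driver = NULL;
		dev->unbind_in_flight = false;
	}
}

/* Stand-in for taking the device lock: a waiter only proceeds once the
 * current lock holder (the unbind) has finished. */
void model_device_lock(struct model_device *dev)
{
	model_complete_unbind(dev);
}

/* Stand-in for device_attach(): take the lock, then bind if unbound. */
int model_device_attach(struct model_device *dev)
{
	model_device_lock(dev);
	if (!dev->driver)
		dev->driver = &model_drv;
	return 1;
}

/* Old style: peek at the driver pointer without the lock, skip if set. */
void model_rescan_lockless(struct model_device *dev)
{
	if (!dev->driver)
		model_device_attach(dev);
}

/* New style: always attach, so the rescan waits out the unbind. */
void model_rescan_barrier(struct model_device *dev)
{
	model_device_attach(dev);
}

/* Run one rescan against a device that is mid-unbind, let the unbind
 * finish, and report whether the device ended up bound. */
bool model_bound_after(void (*rescan)(struct model_device *))
{
	struct model_device dev = { &model_drv, true };

	rescan(&dev);
	model_complete_unbind(&dev);
	return dev.driver != NULL;
}
```

In this model, model_bound_after(model_rescan_lockless) returns false
because the rescan saw a stale driver pointer and skipped the device, while
model_bound_after(model_rescan_barrier) returns true because the attach
path serialized against the unbind first.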

The "Fixes:" but no "Cc: stable" on this patch reflects that the issue
was found merely by inspection, since the bug that triggered the
discovery of this potential problem [1] is fixed by other means.
However, a stable backport should do no harm.

Fixes: 8dd2bc0f8e ("cxl/mem: Add the cxl_mem driver")
Link: http://lore.kernel.org/20241004212504.1246-1-gourry@gourry.net [1]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Tested-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://patch.msgid.link/172964781104.81806.4277549800082443769.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Dan Williams 2024-10-22 18:43:32 -07:00 committed by Ira Weiny
parent 6575b26815
commit 3d6ebf1643

@@ -2084,11 +2084,18 @@ static void cxl_bus_remove(struct device *dev)

 static struct workqueue_struct *cxl_bus_wq;

+static int cxl_rescan_attach(struct device *dev, void *data)
+{
+	int rc = device_attach(dev);
+
+	dev_vdbg(dev, "rescan: %s\n", rc ? "attach" : "detached");
+
+	return 0;
+}
+
 static void cxl_bus_rescan_queue(struct work_struct *w)
 {
-	int rc = bus_rescan_devices(&cxl_bus_type);
-
-	pr_debug("CXL bus rescan result: %d\n", rc);
+	bus_for_each_dev(&cxl_bus_type, NULL, NULL, cxl_rescan_attach);
 }

 void cxl_bus_rescan(void)