35b52c7053
On second look at this bug (OFED #2002), it seems that the
collision is not with the retransmission queue (packet acked
by the peer), but with the local send completion. A theoretical
sequence of events (from time t0 to t3) is thought to be as
follows,

Thread #1
t0:
    sock_release
    rds_release
    rds_send_drop_to       /* wait on send completion */
t2:
    rds_rdma_drop_keys()   /* destroy & free all mrs */

Thread #2
t1:
    rds_ib_send_cq_comp_handler
    rds_ib_send_unmap_rm
    rds_message_unmapped   /* wake up #1 @ t0 */
t3:
    rds_message_put
    rds_message_purge
    rds_mr_put             /* memory corruption detected */

The problem with the rds_rdma_drop_keys() is it could
remove a mr's refcount more than its due (i.e. repeatedly
as long as it still remains in the tree (mr->r_refcount > 0)).
Theoretically it should remove only one reference - reference
by the tree.

    /* Release any MRs associated with this socket */
    while ((node = rb_first(&rs->rs_rdma_keys))) {
        mr = container_of(node, struct rds_mr, r_rb_node);
        if (mr->r_trans == rs->rs_transport)
            mr->r_invalidate = 0;
        rds_mr_put(mr);
    }

I think the correct way of doing it is to remove the mr from
the tree and rds_destroy_mr it first, then a rds_mr_put()
to decrement its reference count by one. Whichever thread
holds the last reference will free the mr via rds_mr_put().

Signed-off-by: Tina Yang <tina.yang@oracle.com>
Signed-off-by: Andy Grover <andy.grover@oracle.com>

||
---|---|---|
af_rds.c | ||
bind.c | ||
cong.c | ||
connection.c | ||
ib_cm.c | ||
ib_rdma.c | ||
ib_recv.c | ||
ib_ring.c | ||
ib_send.c | ||
ib_stats.c | ||
ib_sysctl.c | ||
ib.c | ||
ib.h | ||
info.c | ||
info.h | ||
iw_cm.c | ||
iw_rdma.c | ||
iw_recv.c | ||
iw_ring.c | ||
iw_send.c | ||
iw_stats.c | ||
iw_sysctl.c | ||
iw.c | ||
iw.h | ||
Kconfig | ||
loop.c | ||
loop.h | ||
Makefile | ||
message.c | ||
page.c | ||
rdma_transport.c | ||
rdma_transport.h | ||
rdma.c | ||
rdma.h | ||
rds.h | ||
recv.c | ||
send.c | ||
stats.c | ||
sysctl.c | ||
tcp_connect.c | ||
tcp_listen.c | ||
tcp_recv.c | ||
tcp_send.c | ||
tcp_stats.c | ||
tcp.c | ||
tcp.h | ||
threads.c | ||
transport.c |