[rds-devel] that patch to teardown IB resources before RDS conn resources

Zach Brown zach.brown at oracle.com
Thu Nov 11 10:34:45 PST 2010


Heya Andy,

Here's a commented version of that patch to tear down the IB resources
before the RDS resources that the IB callbacks use.

- z

-----

RDS/IB: tear down IB resources before RDS conn resources

RDS IB connection teardown needs to guarantee that there will be no
more work request completion callbacks before it tears down the
resources that the callback handlers use.  Currently it assumes that
there will be no more completion callbacks once all signaled work
requests have completed.

That, apparently, is wrong.  Non-signaled IB work requests can raise
error completions and end up calling our completion handlers.  We were
seeing bugs that suggest this: event handlers were crashing as they
tried to work with freed resources.

This patch attempts to truly guarantee that no more completion handlers
will be called by tearing down the sources of the callbacks themselves.
It destroys the QP, the CQs, and the CM ID for the connection before
freeing the resources that the callbacks need.  The assumption is that
the destruction calls will not return until pending callbacks have
completed.

It seems to work.  We haven't seen the crashes.  But, as far as I know,
no one who actually understands the rules of the IB stack has reviewed
this.

Signed-off-by: Zach Brown <zach.brown at oracle.com>

diff --git a/net/rds/ib_cm.c b/net/rds/ib_cm.c
--- a/net/rds/ib_cm.c
+++ b/net/rds/ib_cm.c
@@ -742,6 +742,16 @@ void rds_ib_conn_shutdown(struct rds_con
 		tasklet_kill(&ic->i_stasklet);
 		tasklet_kill(&ic->i_rtasklet);
 
+		/* first destroy the ib state that generates callbacks */
+		if (ic->i_cm_id->qp)
+			rdma_destroy_qp(ic->i_cm_id);
+		if (ic->i_rcq)
+			ib_destroy_cq(ic->i_rcq);
+		if (ic->i_scq)
+			ib_destroy_cq(ic->i_scq);
+		rdma_destroy_id(ic->i_cm_id);
+
+		/* then free the resources that ib callbacks use */
 		if (ic->i_send_hdrs)
 			ib_dma_free_coherent(dev,
 					   ic->i_send_ring.w_nr *
@@ -765,14 +775,6 @@ void rds_ib_conn_shutdown(struct rds_con
 		if (ic->i_recvs)
 			rds_ib_recv_clear_ring(ic);
 
-		if (ic->i_cm_id->qp)
-			rdma_destroy_qp(ic->i_cm_id);
-		if (ic->i_rcq)
-			ib_destroy_cq(ic->i_rcq);
-		if (ic->i_scq)
-			ib_destroy_cq(ic->i_scq);
-		rdma_destroy_id(ic->i_cm_id);
-
 		/*
 		 * Move connection back to the nodev list.
 		 */
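
The ordering the patch moves to can be sketched in user space.  This is
only a rough, single-threaded model and not kernel code: fake_cq,
fake_conn, fake_cq_fire() and teardown_demo() are hypothetical names
made up for illustration, and the sketch does not model the open
question of whether the real destroy calls wait for pending callbacks.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a CQ: the thing that generates callbacks. */
struct fake_cq {
	void (*handler)(void *ctx);
	void *ctx;
};

/* Hypothetical stand-in for the connection state the handler touches. */
struct fake_conn {
	struct fake_cq *cq;
	char *hdrs;		/* plays the role of ic->i_send_hdrs */
	int completions;	/* number of handler invocations */
};

static void fake_completion_handler(void *ctx)
{
	struct fake_conn *conn = ctx;

	/* The handler needs the buffer; with the old order it could be freed. */
	assert(conn->hdrs != NULL);
	conn->hdrs[0] = 1;
	conn->completions++;
}

/* Deliver one completion if the source still exists; returns 1 if it ran. */
static int fake_cq_fire(struct fake_conn *conn)
{
	if (!conn->cq)
		return 0;
	conn->cq->handler(conn->cq->ctx);
	return 1;
}

static int fake_conn_init(struct fake_conn *conn)
{
	memset(conn, 0, sizeof(*conn));
	conn->hdrs = calloc(1, 64);
	conn->cq = malloc(sizeof(*conn->cq));
	if (!conn->hdrs || !conn->cq) {
		free(conn->hdrs);
		free(conn->cq);
		return -1;
	}
	conn->cq->handler = fake_completion_handler;
	conn->cq->ctx = conn;
	return 0;
}

static void fake_conn_shutdown(struct fake_conn *conn)
{
	/* first destroy the state that generates callbacks */
	free(conn->cq);
	conn->cq = NULL;

	/* then free the resources that callbacks use */
	free(conn->hdrs);
	conn->hdrs = NULL;
}

/* Returns how many callbacks ran after shutdown, or -1 on setup failure. */
int teardown_demo(void)
{
	struct fake_conn conn;
	int late;

	if (fake_conn_init(&conn))
		return -1;
	fake_cq_fire(&conn);		/* normal completion, buffer is live */
	fake_conn_shutdown(&conn);
	late = fake_cq_fire(&conn);	/* a late error completion arrives */
	return late;
}
```

In this model a completion that shows up after fake_conn_shutdown() is
dropped because its source is already gone, so the handler never sees
the freed buffer.  Swapping the two steps in fake_conn_shutdown() and
firing in between would trip the assert in the handler, which is the
user-space analogue of the crashes described above.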