[rds-devel] Re: comments on the send CQ completion handler

Tue Jan 8 06:59:21 PST 2008

+void rds_ib_send_unmap_rm(struct rds_ib_connection *ic,
+		          struct rds_ib_send_work *send)
+{
+	rdsdebug("ic %p send %p rm %p\n", ic, send, send->s_rm);
+
+	dma_unmap_sg(ic->i_cm_id->device->dma_device,
+		     send->s_rm->m_sg, send->s_rm->m_nents,
+		     DMA_TO_DEVICE);
+
+	/* raise rdma completion hwm */
+	if (send->s_rm->m_rdma_op)
+		rds_barrier_update(send->s_rm, send->s_rm->m_rdma_op->r_rdma_id);
+	rds_message_put(send->s_rm);
+	send->s_rm = NULL;
+}

+void rds_ib_send_cq_comp_handler(struct ib_cq *cq, void *context)
+{
...
+		completed = rds_ib_ring_completed(&ic->i_send_ring, wc.wr_id, oldest);
+
+		for (i = 0; i < completed; i++) {
+			if (wc.opcode == IB_WC_SEND) {
+				if (send->s_rm)
+					rds_ib_send_unmap_rm(ic, send);
+			}

Actually, I started looking here because I did not understand how the barrier design
goes hand in hand with selective send CQ signaling, that is, the method of asking
for a completion on only 1 out of every m calls to ib_post_send.

My understanding of the code is that for the rm instance (rds_message) that represents
the (possibly empty) message provided as the immediate data of an rdma
operation, the m_rdma_op field points to the rdma op descriptor (am I correct?), and
when rds senses the completion of this rm instance it signals the barrier, correct?

This suggests a possible live-lock, or whatever you may call it: if neither the rdma
nor the send following it set the IB_SEND_SIGNALED bit in the send flags, and
there is no further outgoing traffic, then no completion is generated for them,
the barrier is never raised, and a user who calls for the barrier waits indefinitely.
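
To make the scenario concrete, here is a small user-space toy model of what I mean. It is
not RDS code: every sim_* name, SIGNAL_EVERY and RING_SIZE are made up for illustration,
and it assumes the HCA has already finished every posted work request, so the only thing
holding back the handler is whether a signaled entry exists.

```c
#include <assert.h>
#include <stdbool.h>

#define RING_SIZE    16
#define SIGNAL_EVERY 4  /* hypothetical: ask for a completion on 1 of every 4 posts */

struct sim_send {
	bool     signaled;  /* stands in for IB_SEND_SIGNALED */
	unsigned rdma_id;   /* nonzero if this send carries an rdma op id */
};

struct sim_ring {
	struct sim_send ent[RING_SIZE];
	unsigned head;         /* next slot to post (free-running counter) */
	unsigned tail;         /* oldest slot not yet reaped */
	unsigned unsignaled;   /* posts since the last signaled one */
	unsigned barrier_hwm;  /* highest rdma id whose completion was seen */
};

/* Post one send; it is signaled if forced or if it is the m-th in a row. */
static void sim_post(struct sim_ring *r, unsigned rdma_id, bool force_signal)
{
	struct sim_send *s;

	assert(r->head - r->tail < RING_SIZE);  /* ring must not be full */
	s = &r->ent[r->head++ % RING_SIZE];
	s->rdma_id = rdma_id;
	s->signaled = force_signal || ++r->unsignaled >= SIGNAL_EVERY;
	if (s->signaled)
		r->unsignaled = 0;
}

/*
 * Model of the completion handler. Without a signaled entry there is no
 * CQE, so nothing is reaped. With one, every entry up to and including
 * the newest signaled entry is retired and the barrier hwm is raised.
 * Returns the number of entries reaped.
 */
static unsigned sim_poll(struct sim_ring *r)
{
	unsigned i, last = 0, reaped = 0;
	bool found = false;

	for (i = r->tail; i != r->head; i++) {
		if (r->ent[i % RING_SIZE].signaled) {
			last = i;
			found = true;
		}
	}
	if (!found)
		return 0;

	for (i = r->tail; i != last + 1; i++) {
		struct sim_send *s = &r->ent[i % RING_SIZE];

		if (s->rdma_id > r->barrier_hwm)
			r->barrier_hwm = s->rdma_id;
		reaped++;
	}
	r->tail = last + 1;
	return reaped;
}
```

In this model, a single unsignaled post carrying an rdma id leaves barrier_hwm untouched
no matter how often sim_poll() runs; it only moves once later traffic produces a signaled
entry, or if the post that carries the rdma id is itself forced to be signaled.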

Or.