[rds-devel] Re: comments on the send CQ completion handler

Richard Frank richard.frank at oracle.com
Tue Jan 8 07:41:47 PST 2008


Or Gerlitz wrote:
> +void rds_ib_send_unmap_rm(struct rds_ib_connection *ic,
> +		          struct rds_ib_send_work *send)
> +{
> +	rdsdebug("ic %p send %p rm %p\n", ic, send, send->s_rm);
> +
> +	dma_unmap_sg(ic->i_cm_id->device->dma_device,
> +		     send->s_rm->m_sg, send->s_rm->m_nents,
> +		     DMA_TO_DEVICE);
> +
> +	/* raise rdma completion hwm */
> +	if (send->s_rm->m_rdma_op)
> +		rds_barrier_update(send->s_rm, send->s_rm->m_rdma_op->r_rdma_id);
> +	rds_message_put(send->s_rm);
> +	send->s_rm = NULL;
> +}
>
> +void rds_ib_send_cq_comp_handler(struct ib_cq *cq, void *context)
> +{
> ...
> +		completed = rds_ib_ring_completed(&ic->i_send_ring, wc.wr_id, oldest);
> +
> +		for (i = 0; i < completed; i++) {
> +			if (wc.opcode == IB_WC_SEND) {
> +				if (send->s_rm)
> +					rds_ib_send_unmap_rm(ic, send);
> +			}
>
> Actually, I started to look here, since I did not understand how the barriers design
> goes hand in hand with the selective send CQ signaling, that is the method of asking
> for completion for 1 out of m calls to ib_post_send.
>
> My understanding of the code is that, for the rm instance (rds_message) that represents
> the (possibly empty) message provided as the immediate data of an rdma operation,
> the m_rdma_op field points to the rdma op descriptor (am I correct?), and
> when rds senses the completion of this rm instance it signals the barrier, correct?
>
> This suggests a possible live-lock, or whatever you may call it, when the rdma
> and/or the send following it did not set the IB_SEND_SIGNALED bit in the send flags
> but there's no further outgoing traffic and the user calls for a barrier.
>   
If I understand your point: we need to make sure a completion is signaled
on the send side, for the immediate data send and/or the rdma, so that we
can update the barrier. Otherwise a client waiting on a barrier may get
stuck until it issues a subsequent rdma operation (which hopefully is
signaled).

All send operations should have signaled set when the last fragment is
transmitted, and may also have it set on one or more of the intervening
fragments, depending on the number of fragments sent for a message.
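
To make that rule concrete, here is a minimal sketch, not the actual RDS
code. The names frag_needs_signal and count_signaled and the "interval"
parameter are hypothetical; the only thing taken from the discussion is
that the last fragment of a message is always signaled and intervening
fragments may be.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical policy: request a completion on every `interval`-th
 * fragment of a message and always on the last one. An interval of 0
 * means only the last fragment is signaled.
 */
static bool frag_needs_signal(size_t idx, size_t nfrags, size_t interval)
{
	assert(nfrags > 0 && idx < nfrags);

	if (idx == nfrags - 1)
		return true;	/* last fragment: always signaled */

	return interval > 0 && (idx + 1) % interval == 0;
}

/* Count how many fragments of a message would carry the signaled flag. */
static size_t count_signaled(size_t nfrags, size_t interval)
{
	size_t i, n = 0;

	for (i = 0; i < nfrags; i++)
		if (frag_needs_signal(i, nfrags, interval))
			n++;
	return n;
}
```

In a real send path the boolean would presumably decide whether
IB_SEND_SIGNALED is set in the work request's send_flags; that wiring is
left out here.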

This raises another point: we do not need to signal rdma completions,
since every rdma is always followed by the immediate data send, which
should be signaled. We use the immediate data send completion to set the
barrier hwm.
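
A toy model of that argument, as a sketch only: the struct and function
names below (work_req, barrier, drain) are made up for illustration and
are not RDS structures. It assumes work requests on the send queue
complete in order, so reaping one signaled completion also retires every
unsignaled request posted before it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum wr_kind { WR_RDMA, WR_SEND };

/* Hypothetical stand-in for one posted work request. */
struct work_req {
	enum wr_kind kind;
	bool signaled;		/* would map to IB_SEND_SIGNALED */
	uint64_t rdma_id;	/* rdma announced by this send, 0 if none */
};

/* Hypothetical stand-in for the barrier high-water mark. */
struct barrier {
	uint64_t hwm;
};

/*
 * Retire everything up to and including the last signaled request,
 * raising the hwm from the send entries only. Returns the number of
 * requests retired; 0 means no completion was ever generated.
 */
static size_t drain(struct barrier *b, const struct work_req *ring, size_t n)
{
	size_t i, reaped = 0;

	for (i = 0; i < n; i++)
		if (ring[i].signaled)
			reaped = i + 1;

	for (i = 0; i < reaped; i++)
		if (ring[i].kind == WR_SEND && ring[i].rdma_id > b->hwm)
			b->hwm = ring[i].rdma_id;

	return reaped;
}
```

With an unsignaled rdma followed by a signaled send, drain() retires both
entries and the hwm moves. With both unsignaled and nothing posted
afterwards, nothing is retired and the hwm stays put, which is the stuck
barrier case Or describes.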


> Or.


