[rds-devel] Re: comments on the send CQ completion handler

Wed Jan 9 08:51:38 PST 2008

Olaf Kirch wrote:
> On Wednesday 09 January 2008 17:01, Or Gerlitz wrote:
>> Richard Frank wrote:
>>> All send operations should have signaled set as the last fragment is 
>>> xmitted - and may have it set for one or more of the intervening 
>>> fragments - based on the number of fragments sent for a message.

> No, after the loop there's this obscure little fragment:
> 
> 	/* if we finished the message then send completion owns it */
> 	if (scat == &op->r_sg[op->r_count]) {
> 		prev->s_wr.send_flags = IB_SEND_SIGNALED | IB_SEND_SOLICITED;
> 		prev->s_op = op;
> 	}

OK, got it.
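
To make sure I read that fragment correctly, here is a minimal user-space sketch of the pattern as I understand it. It is not the RDS code: the sketch_ names and flag values are stand-ins I made up, it assumes the whole message is built in one pass, and it leaves the intervening fragments unsignaled, which is a simplification.

```c
#include <stddef.h>

/* Stand-in flag values for this sketch; not the kernel's definitions. */
#define SKETCH_SEND_SIGNALED  (1u << 0)
#define SKETCH_SEND_SOLICITED (1u << 1)

/* Hypothetical, stripped-down work request. */
struct sketch_wr {
	unsigned int send_flags;
	const void *owner;	/* message owned by the send completion, or NULL */
};

/*
 * Build nfrags work requests for one message. Only the work request
 * carrying the final fragment is flagged and given ownership of the
 * message. Returns the number of work requests built.
 */
static int sketch_build_wrs(struct sketch_wr *wrs, int nfrags, const void *msg)
{
	struct sketch_wr *prev = NULL;
	int i;

	for (i = 0; i < nfrags; i++) {
		wrs[i].send_flags = 0;
		wrs[i].owner = NULL;
		prev = &wrs[i];
	}

	/* if we finished the message then send completion owns it */
	if (prev != NULL) {
		prev->send_flags = SKETCH_SEND_SIGNALED | SKETCH_SEND_SOLICITED;
		prev->owner = msg;
	}
	return nfrags;
}
```

With three fragments, only the third work request ends up with both flags set and a non-NULL owner.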

>> On the other thread you wrote that "Our planned use of immediate data 
>> (notification message of rdma completion) keeps the immediate data (msg) 
>> separate from the rdma data" from which I understand you first want to 
>> get notification on the rdma completion and only after this (and 
>> possibly some more processing is done) send the immediate data, did I 
>> miss anything?

> I think there may be some confusion around the term "immediate data".
> What Rick means by that is not the same as what the IB spec means when it
> talks about e.g. RDMA writes with immediate data.
> 
> When the application triggers an RDMA transfer, it does a sendmsg -
> a normal message with some ancillary information that triggers the
> RDMA. The ancillary information is passed via a socket control message
> (msghdr.msg_control).
> 
> The kernel puts this into one rds_message, and attaches the rdma_op
> to it.
> 
> The whole thing is being sent as
> 
>  -	first we queue up the RDMA transfer via rds_ib_xmit_rdma
>  -	then we send the message (what Rick refers to as immediate
> 	data) via a normal SEND; but without waiting for its completion

> Did this help?
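
Restating the two steps as a small sketch, to check that I read them correctly. The types and the sketch_xmit helper are hypothetical and only record the posting order; the real path goes through rds_ib_xmit_rdma and the normal send path, which this does not attempt to reproduce.

```c
#include <stddef.h>

enum sketch_opcode { SKETCH_OP_RDMA, SKETCH_OP_SEND };

/* Hypothetical stand-in for a message with an optional rdma op attached. */
struct sketch_message {
	int has_rdma_op;	/* set when sendmsg carried the RDMA control message */
};

/*
 * Record, in posting order, the work requests queued for one message.
 * Returns how many entries were written to posted[].
 */
static size_t sketch_xmit(const struct sketch_message *rm,
			  enum sketch_opcode *posted, size_t max)
{
	size_t n = 0;

	/* step 1: queue the RDMA transfer, if the message carries one */
	if (rm->has_rdma_op && n < max)
		posted[n++] = SKETCH_OP_RDMA;

	/* step 2: the normal SEND, posted without waiting for step 1 */
	if (n < max)
		posted[n++] = SKETCH_OP_SEND;

	return n;
}
```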

I understand that, but anyway thanks for clarifying. The thing is that I 
understand that

(A) sending the immediate data message is optional, that is the app can 
tell rds "don't send immediate data following the rdma"

(B) the planned usage Rick talks about indeed does not send such 
immediate data

so, thinking that RDS might post the rdma send without asking for a 
completion to be generated, I came to the conclusion that some sort 
of deadlock is possible. However, you claim that there will always be a 
completion for the rdma, correct?
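
To spell out the deadlock I had in mind, here is a toy model of a send ring. It assumes, as I understand the usual accounting, that the sender only learns a slot is free when it sees a signaled completion, and that such a completion also retires the unsignaled requests posted before it. The ring depth and all sketch_ names are invented for illustration.

```c
#include <stdbool.h>

#define SKETCH_SQ_DEPTH 4	/* arbitrary ring size for the sketch */

struct sketch_sq {
	bool signaled[SKETCH_SQ_DEPTH];	/* ring of outstanding work requests */
	unsigned int head;		/* oldest outstanding slot */
	unsigned int outstanding;	/* slots not yet reclaimed */
};

/* Post one work request; fails when the sender believes the ring is full. */
static bool sketch_post(struct sketch_sq *sq, bool signaled)
{
	if (sq->outstanding == SKETCH_SQ_DEPTH)
		return false;
	sq->signaled[(sq->head + sq->outstanding) % SKETCH_SQ_DEPTH] = signaled;
	sq->outstanding++;
	return true;
}

/*
 * Model the hardware finishing everything and the sender polling the CQ:
 * slots are reclaimed only up to and including the newest signaled
 * request. Returns the number of slots reclaimed.
 */
static unsigned int sketch_poll(struct sketch_sq *sq)
{
	unsigned int i, reclaim = 0;

	for (i = 0; i < sq->outstanding; i++)
		if (sq->signaled[(sq->head + i) % SKETCH_SQ_DEPTH])
			reclaim = i + 1;

	sq->head = (sq->head + reclaim) % SKETCH_SQ_DEPTH;
	sq->outstanding -= reclaim;
	return reclaim;
}
```

If four unsignaled requests are posted, sketch_poll reclaims nothing and every later sketch_post fails, which is the stall I was worried about. If the fourth request is signaled, one poll frees the whole ring.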

Or.

> 
> Olaf