[rds-devel] continuous recvmsg(MSG_DONTWAIT) EAGAIN for IB RDMA-read notify_me?

Milind Dumbare milind at linux.com
Sun Sep 21 02:18:00 PDT 2014


Hi Smith,

Apparently rds-sample.c is broken for the current stack. Here is the fix I
got, which should make rds-sample.c work (added lines are marked with +++).

        do {
                rc = recvmsg(sock, &msg, 0);
                if (rc < 0) {
                        printf("%s: Error receiving message: %d %d\n",
                               __func__, rc, errno);
                        goto out2;
                }

                if (flags & VERBOSE_FLAG)
                        printf("Received %s packet %d of len %d, cmsg len %d, on port %d\n",
                               msg.msg_controllen ? "RDS RDMA" : "RDS",
                               count,
                               (uint32_t) iov[0].iov_len,
                               (uint32_t) msg.msg_controllen,
                               din.sin_port);
+++             if (msg.msg_namelen == 0)
+++                     continue;
+++             msg.msg_namelen = sizeof(din);
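
The reason the added lines matter is that recvmsg() treats msg_namelen (and
msg_controllen) as value-result fields: the kernel overwrites them on return,
so a msghdr reused across iterations can end up with a zero-length name
buffer. A minimal sketch of re-arming those fields before every receive is
below. The helper name rearm_msghdr and its parameters are hypothetical and
not part of rds-sample.c; resetting msg_controllen as well is my own
extension of the fix, on the assumption that the control buffer is reused too.

```c
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Re-arm the value-result fields of a msghdr before the next recvmsg().
 * `from` and `ctrl` are caller-owned buffers (hypothetical names);
 * `ctrl` may be NULL with ctrl_len 0 if no control data is expected. */
void rearm_msghdr(struct msghdr *msg, struct sockaddr_in *from,
                  void *ctrl, size_t ctrl_len)
{
        assert(msg != NULL && from != NULL);

        memset(from, 0, sizeof(*from));
        msg->msg_name = from;
        msg->msg_namelen = sizeof(*from);  /* recvmsg() may have shrunk this */
        msg->msg_control = ctrl;
        msg->msg_controllen = ctrl_len;    /* also overwritten on return */
        msg->msg_flags = 0;                /* output-only field, start clean */
}
```

In the loop above this would be called once at the top of each iteration,
before recvmsg(), instead of patching msg_namelen after the fact.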


-Milind

On Wed, Sep 17, 2014 at 11:33 PM, Smith, Stan <stan.smith at intel.com> wrote:

>  Hello,
>
>   Pardon any omissions in this first-time post that I may be unaware of.
>
>
>
> RHEL 6.5 (2.6.32-431-29.2) with the distro-provided InfiniBand stack over
> Mellanox IB hardware (mlx4) and the distro-provided RDS stack.
>
> rds-sample.c (rds-tools-2.0.6) works as expected for non-RDMA operations.
>
> RDMA ops never complete?
>
> With –rr (RDMA-read) the client successfully sends the RDS_CMSG_RDMA_MAP
> control message over an RDS socket bound to the IPoIB IPv4 address and
> waits for server RDMA ack.
>
> Server correctly receives the RDS_CMSG_RDMA_MAP message from an RDS socket
> bound to the IPoIB IF IPv4 address (local loopback) and gets the
> RDS_CMSG_RDMA_DEST control message with the RDMA cookie as CMSG_DATA(msg)
> in response to recvmsg().
>
> Server casts CMSG_DATA(msg) ptr to ‘(struct rds_rdma_args*)’ overlaying
> cookie and fills in the remaining struct rds_rdma_args fields describing
> the RDMA-read op.
> rds_rdma_args.flags = RDS_RDMA_FENCE | RDS_RDMA_NOTIFY_ME;
>
> Server calls sendmsg() to post the RDMA-read operation and gets a
> successful sendmsg() return.
>
> Server issues recvmsg(sock, msg, MSG_DONTWAIT) with the same msg struct as
> used in the previous sendmsg().
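
With RDS_RDMA_NOTIFY_ME set, the completion is expected to come back as a
control message rather than as payload, so the receive side needs its own
control buffer and has to walk the cmsg chain. A hedged sketch of that scan
follows. It assumes <linux/rds.h> is installed and that the notification
arrives as an RDS_CMSG_RDMA_STATUS cmsg carrying a struct rds_rdma_notify;
the function name find_rdma_status is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <linux/rds.h>

#ifndef SOL_RDS
#define SOL_RDS 276     /* Linux value; some older headers do not export it */
#endif

/* Scan the control messages of a completed recvmsg() for an RDMA
 * completion notification. Returns 1 and fills *out if one is found,
 * 0 otherwise. */
int find_rdma_status(struct msghdr *msg, struct rds_rdma_notify *out)
{
        assert(msg != NULL && out != NULL);

        for (struct cmsghdr *c = CMSG_FIRSTHDR(msg); c != NULL;
             c = CMSG_NXTHDR(msg, c)) {
                if (c->cmsg_level != SOL_RDS ||
                    c->cmsg_type != RDS_CMSG_RDMA_STATUS)
                        continue;
                if (c->cmsg_len < CMSG_LEN(sizeof(*out)))
                        continue;       /* too short to hold a notify record */
                /* Copy out instead of casting: CMSG_DATA may be unaligned. */
                memcpy(out, CMSG_DATA(c), sizeof(*out));
                return 1;
        }
        return 0;
}
```

On a match, out->user_token should be whatever was placed in
rds_rdma_args.user_token when the op was posted, and out->status reports the
result. Reusing the sendmsg() msghdr unchanged means msg_control still points
at the RDMA args, so pointing it at a fresh receive buffer first may be worth
trying.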
>
> Server loops forever in
>
>         do {
>                 rc = recvmsg(sock, msg, MSG_DONTWAIT);
>         } while (rc < 0 && errno == EAGAIN);
>
> When a bailout condition inserted into the loop is reached (after waiting
> 7 seconds), the expected RDMA-read data has not appeared, which suggests
> the RDMA-read op was never posted or never completed successfully.
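
Independent of the RDS question, the busy loop plus hand-rolled bailout can
be replaced by a bounded wait that sleeps in poll() until the socket is
readable. Here is a rough sketch, assuming the socket supports poll() for
readability; the names recvmsg_deadline and now_ms are hypothetical and
nothing here is taken from rds-tools.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <time.h>

/* Current time in milliseconds on the monotonic clock. */
long long now_ms(void)
{
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Wait up to timeout_ms for a message, sleeping in poll() between tries.
 * Returns the recvmsg() byte count, or -1 with errno set
 * (ETIMEDOUT when the deadline passes without a message). */
ssize_t recvmsg_deadline(int sock, struct msghdr *msg, int timeout_ms)
{
        assert(msg != NULL && timeout_ms >= 0);
        long long deadline = now_ms() + timeout_ms;

        for (;;) {
                long long left = deadline - now_ms();
                if (left < 0)
                        left = 0;       /* still poll once, without blocking */

                struct pollfd pfd = { .fd = sock, .events = POLLIN };
                int n = poll(&pfd, 1, (int)left);
                if (n == 0) {
                        errno = ETIMEDOUT;
                        return -1;
                }
                if (n < 0) {
                        if (errno == EINTR)
                                continue;   /* interrupted: recompute time left */
                        return -1;
                }

                ssize_t rc = recvmsg(sock, msg, MSG_DONTWAIT);
                if (rc >= 0 || (errno != EAGAIN && errno != EWOULDBLOCK))
                        return rc;
                if (now_ms() >= deadline) { /* readable but empty, out of time */
                        errno = ETIMEDOUT;
                        return -1;
                }
        }
}
```

A call such as recvmsg_deadline(sock, msg, 7000) would mirror the 7 second
bailout described above without spinning on EAGAIN.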
>
>
>
> Questions
>
> 1)      Does anyone have a pointer to a known working version of
> rds-sample.c for RDMA operations?
>
> 2)      Thoughts on why/how to move beyond the stalling do {} while()?
>
>
>
> Thank you,
>
>
>
> Stan.