[rds-devel] continuous recvmsg(MSG_DONTWAIT) EAGAIN for IB RDMA-read notify_me?
Smith, Stan
stan.smith at intel.com
Mon Sep 22 09:44:16 PDT 2014
Hello,
Thanks for the suggestion. Unfortunately the fix did not resolve any of my observed problems.
I suspect the open source RDS stack and rds-sample.c have long since parted ways...
Stan.
From: milind.dumbare at gmail.com [mailto:milind.dumbare at gmail.com] On Behalf Of Milind Dumbare
Sent: Sunday, September 21, 2014 2:18 AM
To: Smith, Stan
Cc: rds-devel at oss.oracle.com
Subject: Re: [rds-devel] continuous recvmsg(MSG_DONTWAIT) EAGAIN for IB RDMA-read notify_me?
Hi Smith,
Apparently rds-sample.c is broken against the current stack. Here is the fix I was given, which should make rds-sample.c work.
do {
        rc = recvmsg(sock, &msg, 0);
        if (rc < 0) {
                printf("%s: Error receiving message: %d %d\n", __func__, rc, errno);
                goto out2;
        }
        if (flags & VERBOSE_FLAG)
                printf("Received %s packet %d of len %d, cmsg len %d, on port %d\n",
                       msg.msg_controllen ? "RDS RDMA" : "RDS",
                       count,
                       (uint32_t) iov[0].iov_len,
                       (uint32_t) msg.msg_controllen,
                       din.sin_port);
+++     if (msg.msg_namelen == 0)
+++             continue;
+++     msg.msg_namelen = sizeof(din);
-Milind
On Wed, Sep 17, 2014 at 11:33 PM, Smith, Stan <stan.smith at intel.com> wrote:
Hello,
Please pardon any omissions in this first post; I may not be aware of them.
Setup: RHEL 6.5 (2.6.32-431-29.2) with the distro-provided InfiniBand stack on Mellanox IB hardware (mlx4) and the distro-provided RDS stack.
rds-sample.c (rds-tools-2.0.6) works as expected for non-RDMA operations.
RDMA ops, however, never seem to complete.
With –rr (RDMA-read), the client successfully sends the RDS_CMSG_RDMA_MAP control message over an RDS socket bound to the IPoIB IPv4 address and waits for the server's RDMA ack.
The server correctly receives the RDS_CMSG_RDMA_MAP message on an RDS socket bound to the IPoIB interface's IPv4 address (local loopback) and gets back an RDS_CMSG_RDMA_DEST control message, with the RDMA cookie as CMSG_DATA(msg), from recvmsg().
The server casts the CMSG_DATA(msg) pointer to '(struct rds_rdma_args *)', overlaying the cookie, and fills in the remaining struct rds_rdma_args fields describing the RDMA-read op.
rds_rdma_args.flags = RDS_RDMA_FENCE | RDS_RDMA_NOTIFY_ME;
The server calls sendmsg() to post the RDMA-read operation, and sendmsg() returns success.
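For readers following along, the step above (filling in struct rds_rdma_args and attaching it to the outgoing message) might look roughly like the sketch below. It assumes the struct layout and constants from <linux/rds.h>; the helper name attach_rdma_read() and its parameters are hypothetical and not taken from rds-sample.c. Unlike the cast-and-overlay approach described above, it builds the struct locally and copies it into the control buffer.

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/rds.h>

/* Hypothetical helper (not from rds-sample.c): attach an RDMA-read request
 * to msg as an RDS_CMSG_RDMA_ARGS control message.
 * ctl must hold at least CMSG_SPACE(sizeof(struct rds_rdma_args)) bytes, and
 * local_iov must stay valid until sendmsg() has returned.
 * Returns 0 on success, -1 if the control buffer is too small. */
static int attach_rdma_read(struct msghdr *msg, void *ctl, size_t ctl_len,
                            rds_rdma_cookie_t cookie,
                            uint64_t remote_addr, uint64_t remote_bytes,
                            struct rds_iovec *local_iov, uint64_t nr_local,
                            uint64_t token)
{
    struct rds_rdma_args args;
    struct cmsghdr *cmsg;

    if (ctl_len < CMSG_SPACE(sizeof(args)))
        return -1;

    memset(ctl, 0, ctl_len);
    msg->msg_control = ctl;
    msg->msg_controllen = CMSG_SPACE(sizeof(args));

    memset(&args, 0, sizeof(args));
    args.cookie = cookie;                      /* cookie received from the peer */
    args.remote_vec.addr = remote_addr;
    args.remote_vec.bytes = remote_bytes;
    args.local_vec_addr = (uint64_t)(uintptr_t)local_iov;
    args.nr_local = nr_local;
    /* Same flags as in the post. Assumption: leaving RDS_RDMA_READWRITE
     * clear is what requests a read rather than a write. */
    args.flags = RDS_RDMA_FENCE | RDS_RDMA_NOTIFY_ME;
    args.user_token = token;                   /* echoed back in the notification */

    cmsg = CMSG_FIRSTHDR(msg);
    cmsg->cmsg_level = SOL_RDS;
    cmsg->cmsg_type = RDS_CMSG_RDMA_ARGS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(args));
    memcpy(CMSG_DATA(cmsg), &args, sizeof(args));
    return 0;
}
```

Copying a locally built struct avoids alignment and aliasing surprises from writing through a cast pointer, at the cost of one memcpy().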
The server then issues recvmsg(sock, msg, MSG_DONTWAIT), using the same msg struct as in the previous sendmsg().
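Since RDS_RDMA_NOTIFY_ME was requested, the completion is expected to show up as a control message on a later recvmsg(). A minimal sketch of checking for it follows, assuming the notification arrives as an RDS_CMSG_RDMA_STATUS control message carrying a struct rds_rdma_notify, as declared in <linux/rds.h>. The helper name find_rdma_status() is hypothetical.

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/rds.h>

/* Hypothetical helper: scan the control messages of a received msghdr for
 * an RDMA completion notification.
 * Returns 1 and fills *out if one is found, 0 otherwise. */
static int find_rdma_status(struct msghdr *msg, struct rds_rdma_notify *out)
{
    struct cmsghdr *cmsg;

    for (cmsg = CMSG_FIRSTHDR(msg); cmsg != NULL; cmsg = CMSG_NXTHDR(msg, cmsg)) {
        if (cmsg->cmsg_level != SOL_RDS)
            continue;
        if (cmsg->cmsg_type != RDS_CMSG_RDMA_STATUS)
            continue;
        if (cmsg->cmsg_len < CMSG_LEN(sizeof(*out)))
            continue;               /* too short to hold a notification */
        memcpy(out, CMSG_DATA(cmsg), sizeof(*out));
        return 1;
    }
    return 0;
}
```

The user_token field in the result can then be compared with the token placed in rds_rdma_args, and the status field says whether the operation succeeded.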
Server loops forever in
do {
        rc = recvmsg(sock, msg, MSG_DONTWAIT);
} while (rc < 0 && errno == EAGAIN);
When a bailout condition inserted into the loop is hit (after waiting 7 seconds), the expected RDMA-read data has not appeared, which suggests the RDMA-read op was never posted or never completed successfully.
Questions
1) Does anyone have a pointer to a known working version of rds-sample.c for RDMA operations?
2) Thoughts on why/how to move beyond the stalling do {} while()?
Thank you,
Stan.
_______________________________________________
rds-devel mailing list
rds-devel at oss.oracle.com
https://oss.oracle.com/mailman/listinfo/rds-devel