[rds-devel] [PATCH] RDS/IB+IW: Another stall/shutdown hang fix.
Steve Wise
swise at opengridcomputing.com
Thu Jan 29 13:48:15 PST 2009
From: Steve Wise <swise at opengridcomputing.com>
Currently, if an RDS ACK is posted while unsignaled SEND WRs are
pending in the send ring, the completion of the RDS ACK will _not_
reap those unsignaled SENDs from the send ring. If no more data is sent
on that connection and the connection is then shut down (for example
via rmmod), the shutdown will hang. I also see a flow control deadlock
in this same state.
The solution is to always mark the last send WR posted as signaled when
the sender is about to be flow controlled, so that its completion reaps
the unsignaled SENDs queued ahead of it.
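To illustrate the failure mode, here is a minimal user-space sketch of a
send ring. It is a toy model, not the RDS code: every name in it
(toy_ring, toy_post, toy_reap, toy_stuck_after_burst) is hypothetical,
and it assumes that only signaled WRs produce a completion and that a
completion frees every ring entry up to and including its own.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_RING_SIZE 8

/* Hypothetical stand-in for one send ring entry. */
struct toy_send {
	bool signaled;	/* would this WR generate a completion? */
};

struct toy_ring {
	struct toy_send ent[TOY_RING_SIZE];
	unsigned int oldest;	/* oldest entry not yet reaped */
	unsigned int in_flight;	/* posted but not yet reaped */
};

/* Post n WRs; optionally mark the final one signaled, as the patch does. */
static void toy_post(struct toy_ring *r, unsigned int n, bool signal_last)
{
	assert(n <= TOY_RING_SIZE - r->in_flight);
	for (unsigned int i = 0; i < n; i++) {
		unsigned int slot = (r->oldest + r->in_flight) % TOY_RING_SIZE;

		r->ent[slot].signaled = signal_last && i == n - 1;
		r->in_flight++;
	}
}

/*
 * Pretend the hardware finished every posted WR. Only signaled WRs
 * produce a completion, and a completion frees every entry up to and
 * including its own. Returns the number of entries freed.
 */
static unsigned int toy_reap(struct toy_ring *r)
{
	unsigned int freed = 0;

	for (unsigned int i = 0; i < r->in_flight; i++) {
		if (r->ent[(r->oldest + i) % TOY_RING_SIZE].signaled)
			freed = i + 1;
	}
	r->oldest = (r->oldest + freed) % TOY_RING_SIZE;
	r->in_flight -= freed;
	return freed;
}

/* Entries left stuck in the ring after a burst of n WRs has been reaped. */
static unsigned int toy_stuck_after_burst(unsigned int n, bool signal_last)
{
	struct toy_ring r = { .oldest = 0, .in_flight = 0 };

	toy_post(&r, n, signal_last);
	toy_reap(&r);
	return r.in_flight;
}
```

In this model, toy_stuck_after_burst(3, false) leaves all three entries
in the ring, which is the state a shutdown would wait on forever, while
toy_stuck_after_burst(3, true) drains the ring.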
Signed-off-by: Steve Wise <swise at opengridcomputing.com>
---
drivers/infiniband/ulp/rds/ib_send.c | 9 +++++++++
drivers/infiniband/ulp/rds/iw_send.c | 9 +++++++++
2 files changed, 18 insertions(+), 0 deletions(-)
diff --git a/drivers/infiniband/ulp/rds/ib_send.c b/drivers/infiniband/ulp/rds/ib_send.c
index 20af976..3b41647 100644
--- a/drivers/infiniband/ulp/rds/ib_send.c
+++ b/drivers/infiniband/ulp/rds/ib_send.c
@@ -471,6 +471,7 @@ int rds_ib_xmit(struct rds_connection *conn, struct rds_message *rm,
int send_flags = 0;
int sent;
int ret;
+ int flow_controlled = 0;
BUG_ON(off % RDS_FRAG_SIZE);
BUG_ON(hdr_off != 0 && hdr_off != sizeof(struct rds_header));
@@ -495,6 +496,7 @@ int rds_ib_xmit(struct rds_connection *conn, struct rds_message *rm,
if (credit_alloc < work_alloc) {
rds_ib_ring_unalloc(&ic->i_send_ring, work_alloc - credit_alloc);
work_alloc = credit_alloc;
+ flow_controlled++;
}
if (work_alloc == 0) {
rds_ib_ring_unalloc(&ic->i_send_ring, work_alloc);
@@ -619,6 +621,13 @@ int rds_ib_xmit(struct rds_connection *conn, struct rds_message *rm,
send->s_wr.send_flags |= IB_SEND_SIGNALED | IB_SEND_SOLICITED;
}
+ /*
+ * Always signal the last one if we're stopping due to flow control.
+ */
+ if (flow_controlled && i == (work_alloc-1)) {
+ send->s_wr.send_flags |= IB_SEND_SIGNALED | IB_SEND_SOLICITED;
+ }
+
rdsdebug("send %p wr %p num_sge %u next %p\n", send,
&send->s_wr, send->s_wr.num_sge, send->s_wr.next);
diff --git a/drivers/infiniband/ulp/rds/iw_send.c b/drivers/infiniband/ulp/rds/iw_send.c
index fd57c68..51c1bf7 100644
--- a/drivers/infiniband/ulp/rds/iw_send.c
+++ b/drivers/infiniband/ulp/rds/iw_send.c
@@ -511,6 +511,7 @@ int rds_iw_xmit(struct rds_connection *conn, struct rds_message *rm,
int send_flags = 0;
int sent;
int ret;
+ int flow_controlled = 0;
BUG_ON(off % RDS_FRAG_SIZE);
BUG_ON(hdr_off != 0 && hdr_off != sizeof(struct rds_header));
@@ -542,6 +543,7 @@ int rds_iw_xmit(struct rds_connection *conn, struct rds_message *rm,
if (credit_alloc < work_alloc) {
rds_iw_ring_unalloc(&ic->i_send_ring, work_alloc - credit_alloc);
work_alloc = credit_alloc;
+ flow_controlled++;
}
if (work_alloc == 0) {
rds_iw_ring_unalloc(&ic->i_send_ring, work_alloc);
@@ -666,6 +668,13 @@ int rds_iw_xmit(struct rds_connection *conn, struct rds_message *rm,
send->s_wr.send_flags |= IB_SEND_SIGNALED | IB_SEND_SOLICITED;
}
+ /*
+ * Always signal the last one if we're stopping due to flow control.
+ */
+ if (flow_controlled && i == (work_alloc-1)) {
+ send->s_wr.send_flags |= IB_SEND_SIGNALED | IB_SEND_SOLICITED;
+ }
+
rdsdebug("send %p wr %p num_sge %u next %p\n", send,
&send->s_wr, send->s_wr.num_sge, send->s_wr.next);