[rds-devel] RDS hanging in send queue
Mike Heinz
michael.heinz at qlogic.com
Wed May 27 10:07:56 PDT 2009
Hey, all -
Got a report from one of our testers that rds-ping was failing between two machines. When I checked them, I found messages piling up in the send queue (see below) and /var/log/messages flooded with thousands of copies of this error:
May 27 10:23:18 st2031 kernel: RDS/IB: rdma_accept failed (-22)
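For reference, the -22 in that message is a negated Linux errno value, and errno 22 is EINVAL. A rough Python sketch for decoding such codes and counting how often the message shows up in a log; the names `decode_kernel_error` and `count_accept_failures` are made up for illustration, and the regular expression assumes the exact message format shown above:

```python
import errno
import os
import re

# Kernel code reports errors as negated errno values, so -22 means errno 22.
def decode_kernel_error(code):
    """Return (symbolic name, description) for a negated errno such as -22."""
    number = abs(code)
    return errno.errorcode.get(number, "UNKNOWN"), os.strerror(number)

# Assumed message format, taken from the log line quoted in this report.
PATTERN = re.compile(r"RDS/IB: rdma_accept failed \((-?\d+)\)")

def count_accept_failures(lines):
    """Hypothetical helper: count rdma_accept failures per error code."""
    counts = {}
    for line in lines:
        match = PATTERN.search(line)
        if match:
            code = int(match.group(1))
            counts[code] = counts.get(code, 0) + 1
    return counts

if __name__ == "__main__":
    sample = ["May 27 10:23:18 st2031 kernel: RDS/IB: rdma_accept failed (-22)"]
    print(decode_kernel_error(-22)[0])
    print(count_accept_failures(sample))
```

To run it against a real log, pass an open file object (for example `open("/var/log/messages")`) to `count_accept_failures` in place of the sample list.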
Restarting the rds module on both machines has no effect. Having the machines ping themselves has no effect.
Any suggestions? Below this line is a trimmed copy of rds-info from the same machine:
---------------------------------------------------------------------------
RDS IB Connections:
LocalAddr RemoteAddr LocalDev RemoteDev
172.26.137.51 172.26.137.49 :: ::
Counters:
CounterName Value
conn_reset 20727
(trimmed lines where value was zero)
send_queue_empty 201
(trimmed lines where value was zero)
send_queued 6957
(trimmed lines where value was zero)
ib_connect_raced 45
(trimmed lines where value was zero)
ib_rdma_mr_pool_flush 40
RDS Sockets:
BoundAddr BPort ConnAddr CPort SndBuf RcvBuf Inode
172.26.137.51 74 0.0.0.0 0 8388608 8388608 16870
172.26.137.51 51741 0.0.0.0 0 8388608 8388608 16872
172.26.137.51 29215 0.0.0.0 0 8388608 8388608 16873
172.26.137.51 19359 0.0.0.0 0 8388608 8388608 16874
172.26.137.51 5841 0.0.0.0 0 8388608 8388608 16875
172.26.137.51 13520 0.0.0.0 0 8388608 8388608 16876
172.26.137.51 43209 0.0.0.0 0 8388608 8388608 16877
172.26.137.51 46564 0.0.0.0 0 8388608 8388608 16878
0.0.0.0 0 0.0.0.0 0 8388608 8388608 22533
RDS Connections:
LocalAddr RemoteAddr NextTX NextRX Flg
172.26.137.51 172.26.137.49 6958 0 ---
Receive Message Queue:
LocalAddr LPort RemoteAddr RPort Seq Bytes
Send Message Queue:
LocalAddr LPort RemoteAddr RPort Seq Bytes
172.26.137.51 74 172.26.137.49 0 146 0
172.26.137.51 51741 172.26.137.49 0 147 0
172.26.137.51 29215 172.26.137.49 0 148 0
172.26.137.51 19359 172.26.137.49 0 149 0
172.26.137.51 5841 172.26.137.49 0 150 0
172.26.137.51 13520 172.26.137.49 0 151 0
172.26.137.51 43209 172.26.137.49 0 152 0
172.26.137.51 46564 172.26.137.49 0 153 0
172.26.137.51 74 172.26.137.49 0 154 0
172.26.137.51 51741 172.26.137.49 0 155 0
172.26.137.51 29215 172.26.137.49 0 156 0
172.26.137.51 19359 172.26.137.49 0 157 0
172.26.137.51 5841 172.26.137.49 0 158 0
172.26.137.51 13520 172.26.137.49 0 159 0
172.26.137.51 43209 172.26.137.49 0 160 0
172.26.137.51 46564 172.26.137.49 0 161 0
.
.
.
.
(trimmed remaining records, they only repeat the pattern shown above)
.
.
.
Retransmit Message Queue:
LocalAddr LPort RemoteAddr RPort Seq Bytes
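Since the full send queue is long, a small script can summarize it instead of reading it by eye. The following is only a sketch: it assumes rds-info prints each section as a title line ending in a colon, then one column-header line, then whitespace-separated rows, as in the trimmed copy above. The function names `section_rows` and `queued_per_port` are invented for this example.

```python
from collections import Counter

def section_rows(text, title):
    """Return the data rows (lists of fields) of the section headed by `title`."""
    rows = []
    in_section = False
    header_seen = False
    for raw in text.splitlines():
        line = raw.strip()
        if line.endswith(":"):
            # A new section title starts; track whether it is the one we want.
            in_section = (line == title + ":")
            header_seen = False
            continue
        if not in_section or not line:
            continue
        if not header_seen:
            header_seen = True  # first line after the title is the column header
            continue
        rows.append(line.split())
    return rows

def queued_per_port(text):
    """Count queued messages per local port in the Send Message Queue section."""
    counts = Counter()
    for fields in section_rows(text, "Send Message Queue"):
        # Expect LocalAddr LPort RemoteAddr RPort Seq Bytes; skip elision markers.
        if len(fields) == 6 and fields[1].isdigit():
            counts[int(fields[1])] += 1
    return counts
```

Feeding it the saved output of rds-info would give a per-port count of stuck messages, which makes it easier to see whether every socket is affected or only some.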
--
Michael Heinz
Principal Engineer, Qlogic Corporation
King of Prussia, Pennsylvania