[rds-devel] QP error event with RDMA.

Olaf Kirch olaf.kirch at oracle.com
Mon Apr 28 12:30:28 PDT 2008


Hi Pradeep,

On Monday 28 April 2008 20:58:48 Pradeep wrote:
> I'm hitting an unhandled QP event with the RDMA test:
> rds-stress -r 10.0.0.54 -s 10.0.0.53 -p 4000 -t1 -d4 -D524288  -T20
> 
> RDS/ib: unhandled QP event 3 on connection to 10.0.0.54
> RDS/IB: recv completion on 10.0.0.54 had status 5, disconnecting and reconnecting
> 
> Remote side:
> RDS/IB: send completion on 10.0.0.53 had status 12, disconnecting and reconnecting
> 
> Any idea why RDS is getting QP event 3 (IB_EVENT_QP_ACCESS_ERR)?

Yes, this usually means the R_Key used for the RDMA operation was not
valid. That can happen if your stack is buggy, or if the process that
originally gave you the R_Key has died in the meantime.

I guess we should probably print a more intelligent message than "unhandled
QP event 3" here...
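
Until the message itself improves, the numbers can be decoded after the
fact. The following is only a sketch under stated assumptions: the two
tables are a subset of enum ib_event_type and enum ib_wc_status as defined
in the kernel's include/rdma/ib_verbs.h, while the annotate() helper and
its regular expressions are hypothetical and keyed to the exact wording of
the log lines quoted above.

```python
import re

# Subset of enum ib_event_type (include/rdma/ib_verbs.h).
QP_EVENTS = {
    1: "IB_EVENT_QP_FATAL",
    2: "IB_EVENT_QP_REQ_ERR",
    3: "IB_EVENT_QP_ACCESS_ERR",
}

# Subset of enum ib_wc_status (include/rdma/ib_verbs.h).
WC_STATUS = {
    0: "IB_WC_SUCCESS",
    4: "IB_WC_LOC_PROT_ERR",
    5: "IB_WC_WR_FLUSH_ERR",
    10: "IB_WC_REM_ACCESS_ERR",
    12: "IB_WC_RETRY_EXC_ERR",
    13: "IB_WC_RNR_RETRY_EXC_ERR",
}

# Patterns assume the message wording shown in this thread.
_EVENT_RE = re.compile(r"QP event (\d+)")
_STATUS_RE = re.compile(r"had status (\d+)")


def annotate(line):
    """Append the symbolic name after each numeric code in one log line."""

    def _event(match):
        name = QP_EVENTS.get(int(match.group(1)), "unknown")
        return f"{match.group(0)} ({name})"

    def _status(match):
        name = WC_STATUS.get(int(match.group(1)), "unknown")
        return f"{match.group(0)} ({name})"

    line = _EVENT_RE.sub(_event, line)
    return _STATUS_RE.sub(_status, line)


if __name__ == "__main__":
    for msg in (
        "RDS/ib: unhandled QP event 3 on connection to 10.0.0.54",
        "RDS/IB: recv completion on 10.0.0.54 had status 5, "
        "disconnecting and reconnecting",
        "RDS/IB: send completion on 10.0.0.53 had status 12, "
        "disconnecting and reconnecting",
    ):
        print(annotate(msg))
```

Run against the three messages from the report, this tags event 3 as
IB_EVENT_QP_ACCESS_ERR, status 5 as IB_WC_WR_FLUSH_ERR and status 12 as
IB_WC_RETRY_EXC_ERR. Codes missing from the tables are marked "unknown"
rather than guessed.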

Olaf

> 
> Config:
> OFED: 
> http://www.openfabrics.org/downloads/OFED/ofed-1.3-daily/OFED-1.3-20080408-0623.tgz
> Systems: Two Intel(R) Xeon systems.
> OS version: Red Hat Enterprise Linux AS release 4
> 2.6.9-42.0.10.ELsmp #1 SMP Fri Feb 16 17:17:21 EST 2007 i686 i686 i386 GNU/Linux
> IB Hardware: ConnectX HCA (only one port connected).
> 
> RDS config:
> max_send_wr              128
> max_recv_wr             1024
> max_unsignaled_bytes 8388608
> max_unacked_packets       16
> 
> [root@iblp0053 tmp]# rds-info
> 
> TCP Connections:
>       LocalAddr LPort      RemoteAddr RPort  HdrRemain DataRemain    SentNxt  ExpectUna    SeenUna
> 
> Counters:
>               CounterName            Value
>                conn_reset                4
>    recv_drop_bad_checksum                0
>         recv_drop_old_seq                0
>         recv_drop_no_sock                2
>       recv_drop_dead_sock                0
>        recv_deliver_raced                0
>            recv_delivered            38479
>               recv_queued            38481
>      recv_immediate_retry                0
>        recv_delayed_retry                0
>         recv_ack_required             9478
>           recv_rdma_bytes        747110400
>          send_queue_empty              121
>           send_queue_full                0
>       send_sem_contention                0
>      send_sem_queue_raced                0
>      send_immediate_retry                0
>        send_delayed_retry                0
>           send_drop_acked                0
>         send_ack_required             9443
>                 send_rdma            19234
>           send_rdma_bytes        747110400
>        page_remainder_hit            32069
>       page_remainder_miss             6412
>        cong_update_queued                0
>      cong_update_received                1
>           cong_send_error                0
>         cong_send_blocked                0
>          ib_connect_raced                3
>    ib_listen_closed_stale                0
>             ib_tx_cq_call            52082
>            ib_tx_cq_event            67005
>           ib_tx_ring_full                0
>  ib_tx_sg_mapping_failure                0
>             ib_tx_stalled                0
>             ib_rx_cq_call            38837
>            ib_rx_cq_event            47764
>          ib_rx_ring_empty                0
>      ib_rx_refill_from_cq            38835
>  ib_rx_refill_from_thread                2
>         ib_rx_alloc_limit                0
>               ib_ack_sent             9287
>       ib_ack_send_failure                0
>       ib_ack_send_delayed              389
>           ib_ack_received             9279
>          ib_rdma_mr_alloc              207
>           ib_rdma_mr_free                0
>           ib_rdma_mr_used            19237
>     ib_rdma_mr_pool_flush              186
>      ib_rdma_mr_pool_wait                0
>      tcp_data_ready_calls                0
>     tcp_write_space_calls                0
>           tcp_sndbuf_full                0
>         tcp_connect_raced                0
>   tcp_listen_closed_stale                0
> 
> RDS Sockets:
>       BoundAddr BPort        ConnAddr CPort     SndBuf     RcvBuf
>         0.0.0.0     0         0.0.0.0     0      55296      55296
> 
> RDS Connections:
>       LocalAddr      RemoteAddr           NextTX           NextRX Flg
>       10.0.0.53       10.0.0.54            38484            38484 --C
> 
> Receive Message Queue:
>       LocalAddr LPort      RemoteAddr RPort              Seq      Bytes
> 
> Send Message Queue:
>       LocalAddr LPort      RemoteAddr RPort              Seq      Bytes
> 
> Retransmit Message Queue:
>       LocalAddr LPort      RemoteAddr RPort              Seq      Bytes
> 
> [root@iblp0054 ~]# rds-info
> 
> TCP Connections:
>       LocalAddr LPort      RemoteAddr RPort  HdrRemain DataRemain    SentNxt  ExpectUna    SeenUna
> 
> Counters:
>               CounterName            Value
>                conn_reset               11
>    recv_drop_bad_checksum                0
>         recv_drop_old_seq                0
>         recv_drop_no_sock                0
>       recv_drop_dead_sock                0
>        recv_deliver_raced                0
>            recv_delivered            38483
>               recv_queued            38483
>      recv_immediate_retry                0
>        recv_delayed_retry                0
>         recv_ack_required             9443
>           recv_rdma_bytes        747634688
>          send_queue_empty              120
>           send_queue_full                0
>       send_sem_contention                0
>      send_sem_queue_raced                0
>      send_immediate_retry                0
>        send_delayed_retry                0
>           send_drop_acked                0
>         send_ack_required             9479
>                 send_rdma            19237
>           send_rdma_bytes        748158976
>        page_remainder_hit            32072
>       page_remainder_miss             6413
>        cong_update_queued                0
>      cong_update_received                2
>           cong_send_error                0
>         cong_send_blocked                0
>          ib_connect_raced                2
>    ib_listen_closed_stale                0
>             ib_tx_cq_call            50886
>            ib_tx_cq_event            67011
>           ib_tx_ring_full                0
>  ib_tx_sg_mapping_failure                0
>             ib_tx_stalled               14
>             ib_rx_cq_call            38579
>            ib_rx_cq_event            47978
>          ib_rx_ring_empty                2
>      ib_rx_refill_from_cq            38579
>  ib_rx_refill_from_thread                2
>         ib_rx_alloc_limit                0
>               ib_ack_sent             9279
>       ib_ack_send_failure                0
>       ib_ack_send_delayed              542
>           ib_ack_received             9287
>          ib_rdma_mr_alloc              207
>           ib_rdma_mr_free                0
>           ib_rdma_mr_used            19238
>     ib_rdma_mr_pool_flush              188
>      ib_rdma_mr_pool_wait                0
>      tcp_data_ready_calls                0
>     tcp_write_space_calls                0
>           tcp_sndbuf_full                0
>         tcp_connect_raced                0
>   tcp_listen_closed_stale                0
> 
> RDS Sockets:
>       BoundAddr BPort        ConnAddr CPort     SndBuf     RcvBuf
>         0.0.0.0     0         0.0.0.0     0      55296      55296
> 
> RDS Connections:
>       LocalAddr      RemoteAddr           NextTX           NextRX Flg
>       10.0.0.54       10.0.0.53            38488            38484 --C
> 
> Receive Message Queue:
>       LocalAddr LPort      RemoteAddr RPort              Seq      Bytes
> 
> Send Message Queue:
>       LocalAddr LPort      RemoteAddr RPort              Seq      Bytes
> 
> Retransmit Message Queue:
>       LocalAddr LPort      RemoteAddr RPort              Seq      Bytes
> 
> 
> Thanks,
> Pradeep



-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
okir at lst.de |    / | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax


