[rds-devel] RDS - resource leakage - recv_ring counters - looks buggy

Andy Grover andy.grover at oracle.com
Fri May 22 16:30:55 PDT 2009


Viral Mehta wrote:
> I am, somehow, not able to forward this to netdev list.
> 
> When we run any rds-ping test, it creates a connection and sets up
> the QP. It then posts a batch of recvs (1024, or whatever is
> configured through sysctl). These are pre-posted recvs. More recvs
> are posted whenever the count drops below the low-watermark.
> 
> By design, the connection and all RDMA resources are destroyed only
> when the module is unloaded. During unload, before destroying the
> RDMA resources, it waits until both send_ring and recv_ring are empty.
> =========== iw_cm.c:600:
>     wait_event(rds_iw_ring_empty_wait,
>                rds_iw_ring_empty(&ic->i_send_ring) &&
>                rds_iw_ring_empty(&ic->i_recv_ring));
> ===========
> 
> Ring empty means the difference (ring->w_alloc_ctr - ring->free_ctr)
> is zero. w_alloc_ctr is the number of posted recvs and free_ctr is
> the number of recvs consumed. Ideally, this difference would never be
> zero, since we always want some pre-posted recvs, and thus recv_ring
> would never be empty.

Hi Viral, sorry for the delay in responding.

I believe what happens is that the rdma_disconnect() call above the
wait_event causes all the outstanding recv WRs to be completed with an
error. This causes them to be freed. They are not refilled, and so the
ring becomes empty.
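
To make the counter behaviour concrete, here is a rough userspace
sketch. It is not the rds_iw code: struct ring_model and the ring_*
helpers are hypothetical names made up for illustration, and it only
assumes what is described above (one counter for posted entries, one
for freed entries, no refill after disconnect).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a work ring; not the kernel structure. */
struct ring_model {
    uint32_t nr;        /* ring capacity */
    uint32_t alloc_ctr; /* total entries ever posted */
    uint32_t free_ctr;  /* total entries ever completed and freed */
};

/* Entries currently outstanding. Unsigned subtraction keeps the
 * result correct even after the counters wrap around. */
static uint32_t ring_used(const struct ring_model *r)
{
    return (uint32_t)(r->alloc_ctr - r->free_ctr);
}

static bool ring_empty(const struct ring_model *r)
{
    return ring_used(r) == 0;
}

/* Post up to n entries, limited by free space; returns how many
 * were actually posted. */
static uint32_t ring_post(struct ring_model *r, uint32_t n)
{
    assert(ring_used(r) <= r->nr);
    uint32_t room = r->nr - ring_used(r);

    if (n > room)
        n = room;
    r->alloc_ctr += n;
    return n;
}

/* Complete (free) up to n entries. refill=true models the connected
 * case, where every consumed recv is replaced by a new one. */
static void ring_complete(struct ring_model *r, uint32_t n, bool refill)
{
    uint32_t used = ring_used(r);

    if (n > used)
        n = used;
    r->free_ctr += n;
    if (refill)
        ring_post(r, n);
}

/* Model of the disconnect case: every outstanding entry completes
 * in error and is freed, and nothing is reposted. */
static void ring_flush(struct ring_model *r)
{
    ring_complete(r, ring_used(r), false);
}
```

With nr = 1024, calling ring_post() and then ring_complete() with
refill set keeps ring_used() at 1024, so ring_empty() stays false
while connected. After ring_flush() the two counters are equal and
ring_empty() returns true, which is the condition the wait_event is
waiting for.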

Does this analysis appear correct to you?

Thanks -- Regards -- Andy



