[rds-devel] RDS - resource leakage - recv_ring counters - looks buggy
Andy Grover
andy.grover at oracle.com
Wed May 27 17:58:22 PDT 2009
I'm very curious what the rdsdebug() inside the ib_poll_cq loop in
rds_iw_recv_cq_comp_handler would say. Can you turn on RDS_DEBUG, or
perhaps change that rdsdebug to a printk so we just get that one line of
output? I would guess you will see completions with errors for all
outstanding recv WRs. Can you try this and see what happens?
I'm pretty sure those WRs have to be completed *somewhere*, since as you
pointed out, otherwise we'd hang indefinitely on unload.
Thanks -- Regards -- Andy
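
A minimal userspace sketch of the expectation above: after a disconnect, polling the recv CQ should yield one flush-error completion per outstanding recv WR, and freeing each without reposting leaves nothing outstanding. Everything here is a hypothetical stand-in (cq_sketch, poll_cq_sketch, drain_recv_cq and the _SKETCH status values are not RDS or verbs identifiers); the real code path uses ib_poll_cq() and IB_WC_WR_FLUSH_ERR inside the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mock of a completion status; not the kernel's enum ib_wc_status. */
enum wc_status_sketch { WC_SUCCESS_SKETCH, WC_FLUSH_ERR_SKETCH };

/* Hypothetical mock CQ: a fixed array of pending completion statuses. */
struct cq_sketch {
    const enum wc_status_sketch *entries;
    size_t head;   /* next entry to hand out */
    size_t count;  /* total entries queued   */
};

/* Returns 1 and fills *status if a completion was pending, else 0. */
static int poll_cq_sketch(struct cq_sketch *cq, enum wc_status_sketch *status)
{
    assert(cq != NULL && status != NULL);
    if (cq->head == cq->count)
        return 0;
    *status = cq->entries[cq->head++];
    return 1;
}

struct drain_result {
    unsigned outstanding; /* recv WRs still posted after draining */
    unsigned flushed;     /* completions seen with flush status   */
};

/* Drain the CQ: each completion frees one posted recv, and nothing is reposted. */
static struct drain_result drain_recv_cq(struct cq_sketch *cq, unsigned outstanding)
{
    struct drain_result res = { outstanding, 0 };
    enum wc_status_sketch status;

    while (poll_cq_sketch(cq, &status) > 0) {
        if (status == WC_FLUSH_ERR_SKETCH)
            res.flushed++;
        if (res.outstanding > 0)
            res.outstanding--;
    }
    return res;
}
```

If the drain loop is never run on the disconnect path, outstanding stays at its initial value, which matches the symptom being reported.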
Viral Mehta wrote:
> Hi, I am suspicious of the RDS code again.
>
> I modified the RDS code a little to confirm this. I added a call
> to ib_poll_cq() after the rdma_disconnect() call in
> rds_iw_conn_shutdown().
>
> As expected, I got a CQ completion entry with flush status
> (IB_WC_WR_FLUSH_ERR), which I do not get with the unmodified code.
> That suggests ib_poll_cq() is not being called in the disconnect
> path (or when modprobe -r rds is run).
>
> If you can shed some light I can debug more.
>
> Viral Mehta wrote:
>> Hi Andy, Thanks for your response.
>>
>> Yes, I agree with you. rdma_disconnect should free up all SQ and RQ
>> WQEs. The iWARP spec confirms the same.
>>
>> So it looks like RDS has no problem. I will let you know if I find
>> something else.
>>
>> Thanks again,
>>
>> Andy Grover wrote:
>>
>>> Viral Mehta wrote:
>>>
>>>> Somehow I am not able to forward this to the netdev list.
>>>>
>>>> When we run any rds-ping test, it creates a connection and sets
>>>> up a QP. It then posts recvs (1024, or whatever number is set
>>>> through sysctl). These are pre-posted recvs. More recvs are
>>>> posted whenever the count drops below the low watermark.
>>>>
>>>> By design, the connection and all RDMA resources are destroyed only
>>>> when the module is unloaded. During unload, before destroying the
>>>> RDMA resources, it waits until both send_ring and recv_ring are
>>>> empty:
>>>>
>>>> =========== iw_cm.c:600
>>>> wait_event(rds_iw_ring_empty_wait,
>>>>            rds_iw_ring_empty(&ic->i_send_ring) &&
>>>>            rds_iw_ring_empty(&ic->i_recv_ring));
>>>> ===========
>>>>
>>>> Ring empty means the difference (i.e., ring->w_alloc_ctr -
>>>> ring->free_ctr) is zero. w_alloc_ctr is the number of recvs
>>>> posted and free_ctr is the number of recvs consumed. In normal
>>>> operation this should never be zero for recv_ring, since we
>>>> always want some pre-posted recvs, so recv_ring is never empty.
>>>>
>>> Hi Viral, sorry for the delay in responding.
>>>
>>> I believe what happens is that the rdma_disconnect() above the
>>> wait_event causes all the outstanding recv wrs to be completed
>>> with an error. This causes them to be freed. They are not
>>> refilled, and so the ring becomes empty.
>>>
>>> Does this analysis appear correct to you?
>>>
>>> Thanks -- Regards -- Andy
>
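
The ring-empty condition described in the original report can be sketched in userspace as below. This is an illustration under assumptions, not the RDS implementation: ring_sketch and its helpers are hypothetical names, and only the two counters mentioned in the thread are modelled. It shows why the difference of two free-running unsigned counters stays correct across wraparound, and that the ring reads as empty only once every posted recv has been returned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the work ring; only the counters are modelled. */
struct ring_sketch {
    uint32_t alloc_ctr; /* entries handed out (recvs posted)      */
    uint32_t free_ctr;  /* entries returned (recvs completed)     */
};

/* Entries currently outstanding; unsigned subtraction is wrap-safe. */
static uint32_t ring_used(const struct ring_sketch *r)
{
    return r->alloc_ctr - r->free_ctr;
}

/* The ring is empty when nothing handed out is still outstanding. */
static bool ring_empty(const struct ring_sketch *r)
{
    return ring_used(r) == 0;
}

/* Post n entries (for example, pre-posting recvs). */
static void ring_alloc(struct ring_sketch *r, uint32_t n)
{
    r->alloc_ctr += n;
}

/* Return n entries (for example, after completions, flushed or not). */
static void ring_free(struct ring_sketch *r, uint32_t n)
{
    assert(n <= ring_used(r)); /* cannot free more than is outstanding */
    r->free_ctr += n;
}
```

With 1024 recvs pre-posted and refilled at a low watermark, ring_empty() stays false during normal operation; it can only become true on shutdown if all 1024 entries are freed and none are reposted.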