[rds-devel] [PATCH net-next] rds: avoid lock hierarchy violation between m_rs_lock and rs_recv_lock

David Miller davem at davemloft.net
Sat Aug 11 11:22:25 PDT 2018


From: Sowmini Varadhan <sowmini.varadhan at oracle.com>
Date: Wed,  8 Aug 2018 13:57:13 -0700

> The following deadlock, reported by syzbot, can occur if CPU0 is in
> rds_send_remove_from_sock() while CPU1 is in rds_clear_recv_queue():
> 
>        CPU0                    CPU1
>        ----                    ----
>   lock(&(&rm->m_rs_lock)->rlock);
>                                lock(&rs->rs_recv_lock);
>                                lock(&(&rm->m_rs_lock)->rlock);
>   lock(&rs->rs_recv_lock);
> 
> The deadlock should be avoided by moving the messages from the
> rs_recv_queue into a tmp_list in rds_clear_recv_queue() under
> the rs_recv_lock, and then dropping the refcnt on the messages
> in the tmp_list (potentially resulting in rds_message_purge())
> after dropping the rs_recv_lock.
> 
> The same lock hierarchy violation also exists in rds_still_queued()
> and should be avoided in a similar manner.
> 
> Signed-off-by: Sowmini Varadhan <sowmini.varadhan at oracle.com>
> Reported-by: syzbot+52140d69ac6dc6b927a9 at syzkaller.appspotmail.com
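
For readers following the thread, the approach described in the quoted text can be sketched in userspace C. This is only an illustration under stated assumptions, not the code from the patch: pthread mutexes stand in for the kernel spinlocks, and every name (struct msg, struct sock, msg_put(), sock_clear_recv_queue(), demo_clear()) is hypothetical. The point it shows is that the queue is detached onto a temporary list under the receive lock, and the references are dropped only after that lock is released, so the per-message lock is never taken while the receive lock is held.

```c
/* Userspace sketch only: pthread mutexes stand in for kernel spinlocks,
 * and all names below are hypothetical, not taken from net/rds. */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

struct msg {
    pthread_mutex_t m_lock;     /* plays the role of rm->m_rs_lock */
    int refcnt;
    struct msg *next;
};

struct sock {
    pthread_mutex_t recv_lock;  /* plays the role of rs->rs_recv_lock */
    struct msg *recv_queue;
};

static int purged;  /* demo counter; only touched from one thread here */

static struct msg *msg_new(void)
{
    struct msg *m = calloc(1, sizeof(*m));
    if (!m)
        return NULL;
    pthread_mutex_init(&m->m_lock, NULL);
    m->refcnt = 1;
    return m;
}

/* Drop one reference. Taking m_lock here stands in for the purge path. */
static void msg_put(struct msg *m)
{
    int last;

    pthread_mutex_lock(&m->m_lock);
    assert(m->refcnt > 0);
    last = (--m->refcnt == 0);
    pthread_mutex_unlock(&m->m_lock);
    if (last) {
        pthread_mutex_destroy(&m->m_lock);
        free(m);
        purged++;
    }
}

static void sock_enqueue(struct sock *s, struct msg *m)
{
    pthread_mutex_lock(&s->recv_lock);
    m->next = s->recv_queue;
    s->recv_queue = m;
    pthread_mutex_unlock(&s->recv_lock);
}

static void sock_clear_recv_queue(struct sock *s)
{
    struct msg *tmp_list, *m;

    /* Step 1: detach the whole queue while holding only recv_lock. */
    pthread_mutex_lock(&s->recv_lock);
    tmp_list = s->recv_queue;
    s->recv_queue = NULL;
    pthread_mutex_unlock(&s->recv_lock);

    /* Step 2: drop references with recv_lock released, so m_lock is
     * never acquired while recv_lock is held. */
    while ((m = tmp_list) != NULL) {
        tmp_list = m->next;
        msg_put(m);
    }
}

/* Demo helper: queue n messages, clear the queue, and return how many
 * messages were purged (or -1 if the queue was not left empty). */
static int demo_clear(int n)
{
    struct sock s = { .recv_queue = NULL };
    int empty;

    pthread_mutex_init(&s.recv_lock, NULL);
    purged = 0;
    for (int i = 0; i < n; i++) {
        struct msg *m = msg_new();
        if (m)
            sock_enqueue(&s, m);
    }
    sock_clear_recv_queue(&s);
    empty = (s.recv_queue == NULL);
    pthread_mutex_destroy(&s.recv_lock);
    return empty ? purged : -1;
}
```

With this ordering, a thread that already holds a message's lock and then wants the receive lock (the CPU0 side of the trace) can no longer be blocked by a thread that holds the receive lock while waiting for that message's lock.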

I'm putting this in deferred state for now.

Sowmini, once you and Santosh agree on what exactly to do, please
resubmit.

Thank you.


