[Ocfs2-devel] A deadlock when the system does not have sufficient memory
Xue jiufei
xuejiufei at huawei.com
Tue Aug 26 18:57:38 PDT 2014
Hi, Sunil
On 2014/8/26 1:13, Sunil Mushran wrote:
> On Sun, Aug 24, 2014 at 11:05 PM, Joseph Qi <joseph.qi at huawei.com <mailto:joseph.qi at huawei.com>> wrote:
>
> On 2014/8/25 13:45, Sunil Mushran wrote:
> > Please could you expand on that.
> >
> In our scenario, one node can mount multiple volumes across the
> cluster.
> For instance, N1 has mounted ocfs2 volumes say volume1, volume2,
> volume3. And volume3 may do umount/mount during runtime of other
> volumes.
>
>
> I meant expand on the deadlock. Say we are mounting a new volume and that triggers an inode cleanup. That inode being cleaned up will have to be from one of the mounted volumes. How can this lead to a deadlock?
>
> Two variations:
> a) Node death leading to recovery during the mount.
> b) Mount atop a mount.
>
> But I cannot still see a deadlock in either scenario.
The deadlock situation is the same as the one I described in my first mail:
o2net_wq
-> dlm_query_region_handler
   -> kmalloc (insufficient memory)
      -> triggers ocfs2 inode cleanup
         -> ocfs2_drop_lock
            -> calls o2net_send_message to send the unlock message
            -> wait_event(nsw.ns_wq, o2net_nsw_completed(nn, &nsw))
               to wait for the reply from the master
-> TCP layer receives the reply and calls o2net_data_ready
-> sc_rx_work is queued, but o2net_wq cannot handle this work
   because its single worker is still blocked in the handler above
So it triggers the deadlock: o2net_wq is waiting for itself to
handle the unlock reply and complete the nsw.
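The pattern above is a single-threaded workqueue deadlocking on itself: the handler running on o2net_wq queues follow-up work on that same workqueue and then blocks waiting for it, so the follow-up can never run. As a minimal sketch (not the kernel code; the names query_region_handler/handle_reply are stand-ins for dlm_query_region_handler and sc_rx_work), the same pattern can be modeled with a one-worker executor, using a timeout so the demo does not hang forever:

```python
from concurrent.futures import ThreadPoolExecutor
import threading

# A single-worker executor stands in for o2net_wq, which has one worker thread.
wq = ThreadPoolExecutor(max_workers=1)
reply_received = threading.Event()

def handle_reply():
    # Stand-in for the queued sc_rx_work that would complete the nsw.
    reply_received.set()

def query_region_handler():
    # Stand-in for the dlm_query_region_handler -> ocfs2_drop_lock path:
    # it queues the "reply" work on the SAME workqueue, then blocks on it,
    # like wait_event(nsw.ns_wq, o2net_nsw_completed(nn, &nsw)).
    wq.submit(handle_reply)
    # The single worker is busy right here, so handle_reply cannot run:
    # this wait times out instead of completing.
    return reply_received.wait(timeout=2)

fut = wq.submit(query_region_handler)
deadlocked = (fut.result() is False)  # the wait timed out: self-deadlock
print(deadlocked)
```

With real wait_event there is no timeout, so the kernel variant blocks forever rather than recovering as this sketch does.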
Thanks.
Xuejiufei