[Ocfs2-devel] Ocfs2-devel Digest, Vol 167, Issue 20

Gang He ghe at suse.com
Sun Jan 14 19:03:29 PST 2018


Hi Zhonghua,

From what you describe, the issue is reproducible.
I suggest triggering a core dump when you find the threads deadlocked; a crash dump is the most convincing evidence.
Then you can use the crash command to list the related threads, backtraces, and lock variables, and include them in the patch description.
That makes it easy for reviewers to see where the deadlock occurs during code review,
and easy for users to find your patch for this bug in the future.


Thanks
Gang   


>>> 
> Hi Guozhonghua,
> 
> It seems that the deadlock can be reproduced easily, right? Sharing the
> lock with the VFS layer is probably risky, and introducing a new lock for
> "quota_recovery" sounds good. Could you post a patch to fix this
> problem?
> 
> thanks,
> Jun
> 
> On 2018/1/13 11:04, Guozhonghua wrote:
>> 
>>> Message: 1
>>> Date: Fri, 12 Jan 2018 06:15:01 +0000
>>> From: Shichangkuo <shi.changkuo at h3c.com>
>>> Subject: Re: [Ocfs2-devel] [Ocfs2-dev] BUG: deadlock with umount and
>>> 	ocfs2 workqueue triggered by ocfs2rec thread
>>> To: Joseph Qi <jiangqi903 at gmail.com>, "zren at suse.com" <zren at suse.com>,
>>> 	"jack at suse.cz" <jack at suse.cz>
>>> Cc: "ocfs2-devel at oss.oracle.com" <ocfs2-devel at oss.oracle.com>
>>> Message-ID:
>>> 	<D1E4D02760513D4B90DC3B40FF32AF35E22EC513 at H3CMLB14-EX.srv.huawei-3com.com>
>>>
>>> Content-Type: text/plain; charset="gb2312"
>>>
>>> Hi Joseph
>>>     Thanks for replying.
>>>     Umount flushes the ocfs2 workqueue in the function
>>> ocfs2_truncate_log_shutdown, and journal recovery is one of the work items on the ocfs2 wq.
>>>
>>> Thanks
>>> Changkuo
>>>
>> 
>> umount
>>   mntput
>>     cleanup_mnt
>>       deactivate_super: takes the rw_semaphore for writing: down_write(&s->s_umount)
>>         deactivate_locked_super
>>           kill_sb: kill_block_super
>>             generic_shutdown_super
>>               put_super: ocfs2_put_super
>>                 ocfs2_dismount_volume
>>                   ocfs2_truncate_log_shutdown
>>                     flush_workqueue(osb->ocfs2_wq);
>>                       ocfs2_finish_quota_recovery
>>                         down_read(&sb->s_umount);
>>
>> Here down_read is attempted on the rw_semaphore while umount still holds it for writing. Taking the same rw_semaphore twice: a deadlock?
>> The flush of ocfs2_wq will be blocked, and so will the umount operation.
>> 
>> Thanks. 
>> 
>> 
>> _______________________________________________
>> Ocfs2-devel mailing list
>> Ocfs2-devel at oss.oracle.com 
>> https://oss.oracle.com/mailman/listinfo/ocfs2-devel 
>> 
> 



