[Ocfs2-devel] [RFC] Doubt about dlm_worker

Joseph Qi joseph.qi at huawei.com
Thu Sep 10 04:49:45 PDT 2015


Hi Junxiao & Sunil,
Your comments would be appreciated.

Thanks,
Joseph

On 2015/9/6 21:11, Joseph Qi wrote:
> The comment on dlm_dispatch_work reads:
> /* Worker function used during recovery. */
> 
> But dlm_worker is actually used by four kinds of dlm message workers:
> 	dlm_assert_master_worker
> 	dlm_deref_lockres_worker
> 	dlm_request_all_locks_worker
> 	dlm_mig_lockres_worker
> 
> The first two are not related to dlm recovery. Moreover,
> dlm_assert_master_worker sends DLM_ASSERT_MASTER_MSG to all other
> nodes, and a large number of assert masters may be queued during
> recovery. In our scenario, there are tens of thousands.
> This delays recovery, because dlm_worker is a single-threaded
> workqueue and the cluster hangs while dlm recovery is in progress.
> So I wonder whether we can move the assert master work to a new
> workqueue, or simply use a system workqueue.
> Any suggestions?
> 
> 
> 
> 




