[Ocfs2-devel] [RFC] Doubt about dlm_worker

Joseph Qi joseph.qi at huawei.com
Sun Sep 6 06:11:53 PDT 2015


The comment for dlm_dispatch_work reads:
/* Worker function used during recovery. */

But dlm_worker is actually used by four types of DLM message workers:
	dlm_assert_master_worker
	dlm_deref_lockres_worker
	dlm_request_all_locks_worker
	dlm_mig_lockres_worker

The first two are not related to DLM recovery. Moreover,
dlm_assert_master_worker sends DLM_ASSERT_MASTER_MSG to all other nodes,
and a large number of assert masters may be performed during recovery;
in our scenario, tens of thousands.
This delays recovery, because dlm_worker is a single-threaded workqueue
and the cluster hangs while DLM recovery is in progress.
So I wonder whether we can move the assert master work to a new
workqueue, or just use a system workqueue.
Any suggestions?
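
To illustrate the concern, here is a rough userspace model of the
queueing behaviour. It is not kernel code and not a patch. The enum
values are only named after the four workers listed above, and
items_ahead() and its "split" layout (one queue for recovery work, one
for everything else) are hypothetical names made up for this sketch. It
only counts how many queued items a single-threaded FIFO has to run
before it reaches a given item.

```c
#include <stddef.h>

/* Work item kinds, named after the four workers listed above. */
enum work_kind {
	ASSERT_MASTER,
	DEREF_LOCKRES,
	REQUEST_ALL_LOCKS,
	MIG_LOCKRES
};

/* The post treats only the last two workers as recovery related. */
static int is_recovery_work(enum work_kind kind)
{
	return kind == REQUEST_ALL_LOCKS || kind == MIG_LOCKRES;
}

/*
 * Count the items that a single-threaded FIFO must run before it
 * reaches items[target].
 *
 * split == 0: one shared queue, so every earlier item is ahead of it.
 * split != 0: hypothetical layout with one queue for recovery work and
 *             one for everything else, so only earlier items of the
 *             same class are ahead of it.
 */
static size_t items_ahead(const enum work_kind *items, size_t target,
			  int split)
{
	size_t i, ahead = 0;

	if (!split)
		return target;

	for (i = 0; i < target; i++)
		if (is_recovery_work(items[i]) ==
		    is_recovery_work(items[target]))
			ahead++;
	return ahead;
}
```

With five ASSERT_MASTER items queued ahead of one REQUEST_ALL_LOCKS
item, items_ahead() gives 5 for the shared queue and 0 for the split
layout. In the scenario above, the 5 would be tens of thousands, each
involving messages to all other nodes.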



