[Ocfs2-devel] Can recovery be done in process context (as opposed to kthread)?

Goldwyn Rodrigues rgoldwyn at gmail.com
Fri Sep 9 15:22:49 PDT 2011


Hi,

I finally got back to improving the recovery procedure by offloading
work to work queues. However, I would like to know whether we can do
away with the ocfs2rec kthread completely. The calling process would
just mark the nodes that need recovery, offload the work onto the work
queues, and wait until it has all completed.
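
To make the idea concrete, here is a rough userspace analogy of the
mark / offload / wait pattern, using pthreads in place of kernel work
queues. It is only a sketch: recovery_ctx, work_item, recover_node()
and recover_all() are hypothetical names invented for illustration,
not ocfs2 code, and the journal replay is left as a comment. In the
kernel, the final wait would presumably be a killable one (for
example wait_for_completion_killable()), which plain
pthread_cond_wait() does not model.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 8

/* Hypothetical shared recovery state; not an ocfs2 structure. */
struct recovery_ctx {
    pthread_mutex_t lock;
    pthread_cond_t  done;
    bool needs_recovery[MAX_NODES];
    int  pending;    /* work items not yet finished */
    int  recovered;  /* nodes recovered so far */
};

struct work_item {
    struct recovery_ctx *ctx;
    int node;
};

/* Worker: stands in for one queued recovery work item. */
static void *recover_node(void *arg)
{
    struct work_item *w = arg;
    struct recovery_ctx *ctx = w->ctx;

    /* Journal replay for w->node would happen here. */

    pthread_mutex_lock(&ctx->lock);
    ctx->needs_recovery[w->node] = false;
    ctx->recovered++;
    if (--ctx->pending == 0)
        pthread_cond_broadcast(&ctx->done);
    pthread_mutex_unlock(&ctx->lock);
    return NULL;
}

/* Mark the dead nodes, offload one worker per node, wait for all.
 * Returns the number of nodes recovered. */
static int recover_all(const int *dead_nodes, size_t n)
{
    struct recovery_ctx ctx = { .pending = 0 };
    struct work_item items[MAX_NODES];
    pthread_t workers[MAX_NODES];
    int count = 0, started = 0, recovered;

    pthread_mutex_init(&ctx.lock, NULL);
    pthread_cond_init(&ctx.done, NULL);

    /* Step 1: mark nodes needing recovery (skip invalid/duplicate). */
    for (size_t i = 0; i < n; i++) {
        int node = dead_nodes[i];
        if (node < 0 || node >= MAX_NODES || ctx.needs_recovery[node])
            continue;
        ctx.needs_recovery[node] = true;
        items[count].ctx = &ctx;
        items[count].node = node;
        count++;
    }
    assert(count <= MAX_NODES);
    ctx.pending = count;

    /* Step 2: offload each marked node to its own worker. */
    for (int i = 0; i < count; i++) {
        if (pthread_create(&workers[started], NULL,
                           recover_node, &items[i]) == 0) {
            started++;
        } else {
            /* Could not start a worker: stop waiting for it. */
            pthread_mutex_lock(&ctx.lock);
            ctx.pending--;
            pthread_mutex_unlock(&ctx.lock);
        }
    }

    /* Step 3: wait until all outstanding work has finished. */
    pthread_mutex_lock(&ctx.lock);
    while (ctx.pending > 0)
        pthread_cond_wait(&ctx.done, &ctx.lock);
    recovered = ctx.recovered;
    pthread_mutex_unlock(&ctx.lock);

    for (int i = 0; i < started; i++)
        pthread_join(workers[i], NULL);
    pthread_cond_destroy(&ctx.done);
    pthread_mutex_destroy(&ctx.lock);
    return recovered;
}
```

The waiter here is the caller itself rather than a dedicated thread,
which is the point of the proposal: whoever triggers recovery owns
the wait, so that wait can be made interruptible.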

The reason for doing it this way is to make the mount process
killable. Currently the DLM locks are taken by the ocfs2rec kthread,
while mount waits in uninterruptible sleep until recovery finishes.

This would help High Availability software, which sends a signal to
the mount process if it does not complete within a timeout. This
usually happens when the journal takes a long time to replay,
especially on nodes that are waiting for recovery to complete rather
than doing the actual recovery.

Consider also the case where a node goes down in the middle of I/O on
a mounted system.

Alternatively, we could keep the kthread and add the necessary
coordination.

-- 
Goldwyn
