[Ocfs2-devel] dlm_pick_recovery_master algorithm?
Daniel Phillips
phillips at google.com
Wed May 31 18:01:36 CDT 2006
Thanks Kurt, great answers!
> You wrote:
> One note on all of this: this is NOT how we would like to do recovery
> going forward, we just did not have a solid cluster membership service
> in place that we could use when the mastery/recovery code was written.
> Once we do have a stable mechanism and API (stop/start/finish) to depend
> upon, I would like to rewrite the whole thing for lock-table-based mastery
> and much more sensible recovery.
What is the pedigree of that stop/start/finish API? Is it the only stable
mechanism you know of to build a more sensible recovery on?
> As it stands, it's a brittle structure
> that has to continually try to detect node failures inline and make
> adjustments as recovery is ongoing, which is no fun.
Not to mention slow and not obviously terminating, indeed.
Regards,
Daniel