[Ocfs2-tools-devel] [PATCH] ocfs2_controld: Fix double-leave in complete_mount().
Mark Fasheh
mfasheh at suse.com
Sat Aug 16 09:14:29 PDT 2008
On Sat, Aug 16, 2008 at 02:54:50AM -0700, Joel Becker wrote:
> In commit f5032771bc41bf9ff31ed42f332d8ec8def39e55 (ocfs2_controld: Allow
> multiple real mounts.) we stopped tracking mountpoints. Instead, we
> describe different applications as "services". "tunefs.ocfs2" is one
> service. "fsck.ocfs2" is another. The actual filesystem uses the
> service "ocfs2".
>
> Only one instance of a service is allowed, except for the filesystem.
> You can, of course, have one device mounted at multiple mountpoints:
>
> # mount /dev/sdb1 /ocfs2
> # mount /dev/sdb1 /ocfs3
>
> In the special case of the "ocfs2" service, ocfs2_controld will allow
> more than one MOUNT call (sent by o2cb_begin_group_join).
> The additional mounters get EALREADY, which libo2cb knows to interpret
> as success. It goes like this:
>
> mount.ocfs2 on /ocfs2 (first)           ocfs2_controld
> -----------------------------           --------------
> o2cb_begin_group_join(uuid)
>                                         start_join(uuid)
>                                         finish_join(uuid)
>                                         dlmcontrol_register(uuid)
>                                         notify_mount_client(0)
> err = mount(dev, mntpnt)
> o2cb_complete_group_join(uuid, err)
>                                         complete_mount(uuid, err)
>
> mount.ocfs2 on /ocfs3 (additional)      ocfs2_controld
> ----------------------------------      --------------
> o2cb_begin_group_join(uuid)
>                                         notify_client(EALREADY)
> err = mount(dev, mntpnt)
> o2cb_complete_group_join(uuid, err)
>                                         complete_mount(uuid, err)
>
> Here's the crux of the problem. If that first mounter gets an error
> from mount(2), the daemon's complete_mount() should leave the group.
> There's no filesystem mounted.
>
> However, if the *second* mounter gets an error from mount(2) (say, a
> missing mountpoint), the daemon should not leave the group - that first
> mount is still going! That's the bug. The daemon didn't know the
> difference, and it would leave the group. The first mount was left out
> in the lurch.
>
> The fix is to mark the additional mounts as such. complete_mount()
> notices the additional flag and does nothing beyond responding to
> mount.ocfs2(8).
>
> dead_mounter() had the same problem. If an additional mounter died, it
> was treated like a first mounter. dead_mounter() now does what
> complete_mount() does. It cleans up the additional state and nothing
> else.
>
> While we're there, we've learned enough about our state to handle first
> mounts that died before sending their status to the daemon. We've
> always known that a dead_mounter() during group join could just set
> leave_on_join. But if the mount program has already been notified,
> there may be a mounted filesystem. We pinned the filesystem as busy and
> basically locked out all other operations.
>
> But as it turns out, a fully operational group is a good state. We can
> clear the in-progress flag and allow new operations. Additional mounts
> can happen cleanly, and umounts as well. If the filesystem never got
> mounted, a cleanup with ocfs2_hb_ctl is safe. It's up to the
> administrator to check safety, but it's a predictable environment.
Ok, that's good. I'm glad that we're erring on the side of caution. We can
worry about being extra-fancy later :)
The patch looks good - thanks for commenting the heck out of it too.
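For anyone reading this thread without the patch at hand, here is a rough sketch of the behavior Joel describes. It is not the ocfs2_controld code. The structs, the fields (additional, mount_in_progress, additional_pending) and begin_mount() are hypothetical names made up for illustration, and group join is modeled as instantaneous, so the leave_on_join case is not shown. Returning EBUSY while the first mount is still in progress is also an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified state. NOT the real ocfs2_controld structures. */
enum group_state { GROUP_LEFT, GROUP_JOINED };

struct mountgroup {
    enum group_state state;
    bool mount_in_progress;  /* first mounter has not reported status yet */
    int additional_pending;  /* additional mounters that have not reported */
};

struct mounter {
    struct mountgroup *mg;
    bool additional;         /* MOUNT was answered with EALREADY */
};

/* Handle a MOUNT request for the "ocfs2" service (illustrative only). */
int begin_mount(struct mountgroup *mg, struct mounter *m)
{
    m->mg = NULL;
    m->additional = false;

    /* Assumption: an in-progress first mount locks out other operations. */
    if (mg->mount_in_progress)
        return EBUSY;

    m->mg = mg;
    if (mg->state == GROUP_JOINED) {
        /* Group already up: mark this mounter as additional. */
        m->additional = true;
        mg->additional_pending++;
        return EALREADY;     /* libo2cb interprets this as success */
    }

    /* First mounter: join the group (modeled as instantaneous here). */
    mg->state = GROUP_JOINED;
    mg->mount_in_progress = true;
    return 0;
}

/* Drop the bookkeeping for an additional mounter and nothing else. */
static void cleanup_additional(struct mounter *m)
{
    assert(m->mg->additional_pending > 0);
    m->mg->additional_pending--;
}

/* The mount program reports the result of mount(2). */
void complete_mount(struct mounter *m, int err)
{
    struct mountgroup *mg = m->mg;

    if (m->additional) {
        /* Never leave the group here: the first mount is still going. */
        cleanup_additional(m);
        return;
    }

    mg->mount_in_progress = false;
    if (err)
        mg->state = GROUP_LEFT;  /* first mount failed, nothing is mounted */
}

/* The mount program died before reporting its status. */
void dead_mounter(struct mounter *m)
{
    struct mountgroup *mg = m->mg;

    if (m->additional) {
        cleanup_additional(m);
        return;
    }

    /*
     * First mounter died after being notified. The filesystem may be
     * mounted, so stay in the group, but clear the in-progress flag so
     * additional mounts and umounts can proceed.
     */
    mg->mount_in_progress = false;
}
```

The point of the additional flag is that both complete_mount() and dead_mounter() can tell the two kinds of mounter apart, so a failed second mount on a missing mountpoint no longer pulls the group out from under the first mount.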
Signed-off-by: Mark Fasheh <mfasheh at suse.com>
--Mark
--
Mark Fasheh