[Ocfs2-tools-commits] branch, daemon-fixups, created. ocfs2-tools-1.4.0-136-g71bdcce
svn-commits at oss.oracle.com
Sat Aug 16 02:55:09 PDT 2008
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "Tools to manage the ocfs2 filesystem.".
The branch, daemon-fixups has been created
at 71bdccedf747073b0e060a7ee1afcf09bae28582 (commit)
- Log -----------------------------------------------------------------
commit 71bdccedf747073b0e060a7ee1afcf09bae28582
Author: Joel Becker <joel.becker at oracle.com>
Date: Sat Aug 16 02:35:13 2008 -0700
ocfs2_controld: Fix double-leave in complete_mount().
In commit f5032771bc41bf9ff31ed42f332d8ec8def39e55 (ocfs2_controld: Allow
multiple real mounts.) we stopped tracking mountpoints. Instead, we
describe different applications as "services". "tunefs.ocfs2" is one
service. "fsck.ocfs2" is another. The actual filesystem uses the
service "ocfs2".
Only one instance of a service is allowed, except for the filesystem.
You can, of course, have one device mounted at multiple mountpoints:
    # mount /dev/sdb1 /ocfs2
    # mount /dev/sdb1 /ocfs3
In the special case of the "ocfs2" service, ocfs2_controld will allow
more than one MOUNT call (sent by o2cb_begin_group_join).
The additional mounters get EALREADY, which libo2cb knows to interpret
as success. It goes like this:
    mount.ocfs2 on /ocfs2 (first)          ocfs2_controld
    -----------------------------          --------------
    o2cb_begin_group_join(uuid)
                                           start_join(uuid)
                                           finish_join(uuid)
                                           dlmcontrol_register(uuid)
                                           notify_mount_client(0)
    err = mount(dev, mntpnt)
    o2cb_complete_group_join(uuid, err)
                                           complete_mount(uuid, err)

    mount.ocfs2 on /ocfs3 (additional)     ocfs2_controld
    ----------------------------------     --------------
    o2cb_begin_group_join(uuid)
                                           notify_client(EALREADY)
    err = mount(dev, mntpnt)
    o2cb_complete_group_join(uuid, err)
                                           complete_mount(uuid, err)
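
The two exchanges above can be sketched roughly as follows. This is
only an illustration, not the daemon's code: service_sketch,
handle_mount_sketch() and client_join_status_sketch() are hypothetical
names, and returning EBUSY for a duplicate non-filesystem service is an
assumption (the message does not say which error is used).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical record of one service on one device; not a real structure. */
struct service_sketch {
    const char *name;   /* "ocfs2", "tunefs.ocfs2", "fsck.ocfs2", ... */
    bool in_use;        /* set once a first MOUNT call has been accepted */
};

/*
 * Daemon side: decide how to answer a MOUNT call.
 * Returns 0 for the first user of a service (a group join would start here),
 * EALREADY for an additional "ocfs2" mounter, and EBUSY (assumed) for a
 * second instance of any other service.
 */
int handle_mount_sketch(struct service_sketch *svc)
{
    assert(svc != NULL);

    if (!svc->in_use) {
        svc->in_use = true;
        return 0;
    }
    if (strcmp(svc->name, "ocfs2") == 0)
        return EALREADY;    /* the filesystem may be mounted again */
    return EBUSY;           /* assumption: other services are exclusive */
}

/* Client side: EALREADY from the daemon is interpreted as success. */
int client_join_status_sketch(int daemon_err)
{
    return daemon_err == EALREADY ? 0 : daemon_err;
}
```

In this sketch a second mount of the same device reaches mount(2)
without the daemon starting another group join.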
Here's the crux of the problem. If that first mounter gets an error
from mount(2), the daemon's complete_mount() should leave the group.
There's no filesystem mounted.
However, if the *second* mounter gets an error from mount(2) (say, a
missing mountpoint), the daemon should not leave the group - that first
mount is still going! That's the bug. The daemon didn't know the
difference, and it would leave the group. The first mount was left in
the lurch.
The fix is to mark the additional mounts as such. complete_mount()
notices the additional flag and does nothing beyond responding to
mount.ocfs2(8).
dead_mounter() had the same problem. If an additional mounter died, it
was treated like a first mounter. dead_mounter() now does what
complete_mount() does. It cleans up the additional state and nothing
else.
While we're there, we've learned enough about our state to handle first
mounts that died before sending their status to the daemon. We've
always known that a dead_mounter() during group join could just set
leave_on_join. But if the mount program has already been notified,
there may be a mounted filesystem. Until now, we pinned the filesystem
as busy and basically locked out all other operations.
But as it turns out, a fully operational group is a good state. We can
clear the in-progress flag and allow new operations. Additional mounts
can happen cleanly, and umounts as well. If the filesystem never got
mounted, a cleanup with ocfs2_hb_ctl is safe. It's up to the
administrator to check safety, but it's a predictable environment.
Signed-off-by: Joel Becker <joel.becker at oracle.com>
-----------------------------------------------------------------------
hooks/post-receive
--
Tools to manage the ocfs2 filesystem.