[Ocfs2-users] umount hang + high CPU
sylarrrrrrr at aim.com
Tue Jul 7 15:15:03 PDT 2009
Aha, OK. I don't see the oops, or anything about the hang, in the logs. The hung machine still replies to pings.
The story now is that I thought I could use:
tunefs.ocfs2 --cloned-volume /dev/mylvmsnapshot
in order to mount the snapshot (big mistake). Well, I did manage to mount the snapshot, but as soon as
I unmounted it, the umount process hung, and then the whole machine hung, except that it still responds to pings.
Now I have downloaded ocfs2-1.4-userguide.pdf, gone to section 'f) DLM Debugging', and tried the commands
there on the still-working node, but only 'cat /sys/kernel/debug/o2dlm/*/dlm_state' worked. It produced the following output:
Domain: 1ACAFCEE7ACA47C089069117560F5C91 Key: 0xb9d649ba
Thread Pid: 5664 Node: 0 State: JOINED
Number of Joins: 1 Joining Node: 255
Domain Map: 0
Live Map: 0
Lock Resources: 51168 (180512)
MLEs: 0 (291689)
Blocking: 0 (139713)
Mastery: 0 (151976)
Migration: 0 (0)
Lists: Dirty=Empty Purge=InUse PendingASTs=Empty PendingBASTs=Empty
Purge Count: 8 Refs: 51169
Dead Node: 255
Recovery Pid: 5665 Master: 255 State: INACTIVE
Recovery Map:
Recovery Node State:
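For a quick look across several domains, the fields of interest can be pulled out of dlm_state with awk. This is only a minimal sketch, assuming the line layout shown in the output above with no leading whitespace; dlm_summary is a made-up helper name, not an ocfs2-tools command.

```shell
#!/bin/sh
# dlm_summary: hypothetical helper (not part of ocfs2-tools).
# Reads dlm_state text on stdin and prints one line per domain with the
# domain name, the domain state and the recovery state.
# Assumes the "Domain:", "Thread Pid:" and "Recovery Pid:" lines look
# like the ones in the output above.
dlm_summary() {
    awk '
        /^Domain:/       { domain = $2 }
        /^Thread Pid:/   { state = $NF }
        /^Recovery Pid:/ { printf "%s domain=%s recovery=%s\n", domain, state, $NF }
    '
}

# Demonstration on three lines copied from the output above:
printf '%s\n' \
    'Domain: 1ACAFCEE7ACA47C089069117560F5C91 Key: 0xb9d649ba' \
    'Thread Pid: 5664 Node: 0 State: JOINED' \
    'Recovery Pid: 5665 Master: 255 State: INACTIVE' | dlm_summary
```

On a live node the same function could be fed with 'cat /sys/kernel/debug/o2dlm/*/dlm_state | dlm_summary'.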
The other commands:
debugfs.ocfs2 –R “fs_locks –B” /dev/drbd0
debugfs.ocfs2 –R “fs_locks –B” /dev/vg/lv
debugfs.ocfs2 –R “dlm_locks M000000000000000022d63c00000000” /dev/drbd0
produced the error:
open: Device name specified was not found while opening context
for device –R
debugfs.ocfs2 1.4.2
debugfs:
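A guess at the cause, not confirmed anywhere in this thread: the error names '–R' as the device, which suggests that the dash and the quotes in the commands above are typographic characters (en dash, curly quotes) picked up when copying from the PDF, rather than ASCII '-' and '"'. A minimal sketch of normalizing a pasted command line before running it; the asciify helper is hypothetical and not part of ocfs2-tools.

```shell
#!/bin/sh
# asciify: hypothetical helper (not part of ocfs2-tools).
# Replaces the en dash and curly quotes that PDF copy/paste tends to
# produce with their ASCII equivalents. Reads stdin, writes stdout.
# Each character gets its own sed expression so the multi-byte
# sequences are matched literally whatever the locale is.
asciify() {
    sed -e 's/–/-/g' \
        -e 's/“/"/g' -e 's/”/"/g' \
        -e "s/‘/'/g" -e "s/’/'/g"
}

# Clean up one of the pasted commands from above and print the result.
printf '%s\n' 'debugfs.ocfs2 –R “fs_locks –B” /dev/drbd0' | asciify
```

The cleaned line can then be inspected and run by hand. The function only prints the command; it does not execute it.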
In addition, the command:
ps -e -o pid,stat,comm,wchan=WIDE-WCHAN-COLUMN
produced no processes in D state.
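To avoid scanning the ps listing by eye, the output can be filtered on the STAT column. A minimal sketch, assuming the column order of the ps command above (PID, STAT, COMM, WCHAN); the d_state_only helper name and the sample lines are made up for illustration.

```shell
#!/bin/sh
# d_state_only: hypothetical filter, not from the user's guide.
# Keeps the header line plus every process whose STAT field (column 2)
# starts with D, i.e. a process in uninterruptible sleep.
# Reads ps output on stdin.
d_state_only() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Live use on a node, with the same ps invocation as above:
#   ps -e -o pid,stat,comm,wchan=WIDE-WCHAN-COLUMN | d_state_only

# Self-contained demonstration on invented sample lines:
printf '%s\n' \
    '  PID STAT COMMAND         WIDE-WCHAN-COLUMN' \
    '    1 Ss   init            -' \
    ' 4242 D+   umount          example_wchan' | d_state_only
```

If only the header line comes back, no process was in D state at that moment.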
I am sorry to write this to the mailing list, but I am a noob, so I don't even know whether it is a bug, a misconfiguration, or a misunderstanding.
PS. Is the nodiratime option supported for mounts? I used it, but I don't see it in the user guide.
-----Original Message-----
From: Sunil Mushran <sunil.mushran at oracle.com>
To: sylarrrrrrr at aim.com
Cc: tao.ma at oracle.com; ocfs2-users at oss.oracle.com
Sent: Tue, Jul 7, 2009 8:46 pm
Subject: Re: [Ocfs2-users] umount hang + high CPU
The fix was for the oops you saw.
The hang is a different issue. We have no info on that.
For that, if you would like to diagnose the problem, read the DLM notes
in the 1.4 user's guide. They explain a debugging process for hangs.
If the issue is dlm related, then we would like to have the tcpdumps.
Lastly, emails are not an efficient vehicle for handling such issues. Use
the bugzilla as it allows us to collect information in one place.
Sunil
sylarrrrrrr at aim.com wrote:
> So this bug is not over yet :(
>
> I have checked my kernel source and indeed it has this patch, but I
> still get the hang.
>
> PS. my linux-2.6-2.6.30/fs/ocfs2/dcache.c kernel source has (lines 290-311):
>
>         else
>                 mlog_errno(ret);
>
>         /*
>          * In case of error, manually free the allocation and do the iput().
>          * We need to do this because error here means no d_instantiate(),
>          * which means iput() will not be called during dput(dentry).
>          */
>         if (ret < 0 && !alias) {
>                 ocfs2_lock_res_free(&dl->dl_lockres);
>                 BUG_ON(dl->dl_count != 1);
>                 spin_lock(&dentry_attach_lock);
>                 dentry->d_fsdata = NULL;
>                 spin_unlock(&dentry_attach_lock);
>                 kfree(dl);
>                 iput(inode);
>         }
>
>         dput(alias);
>
>         return ret;
> }
>
>