[Ocfs2-users] umount hang + high CPU

sylarrrrrrr at aim.com sylarrrrrrr at aim.com
Tue Jul 7 15:15:03 PDT 2009


Aha, ok. I don't see the oops, or anything about the hang, in the logs. The hung machine still replies to pings.

What happened is that I thought I could use:

 tunefs.ocfs2 --cloned-volume /dev/mylvmsnapshot

in order to mount the snapshot (big mistake). I did manage to mount the snapshot, but as soon as
I unmounted it, the umount process hung, and then the whole machine hung, except that it still responds to pings.


I have now downloaded ocfs2-1.4-userguide.pdf, went to section 'f) DLM Debugging', and tried the commands
there on the still-working node, but only 'cat /sys/kernel/debug/o2dlm/*/dlm_state' worked. It produced the following output:

Domain: 1ACAFCEE7ACA47C089069117560F5C91  Key: 0xb9d649ba
Thread Pid: 5664  Node: 0  State: JOINED
Number of Joins: 1  Joining Node: 255
Domain Map: 0 
Live Map: 0 
Lock Resources: 51168 (180512)
MLEs: 0 (291689)
  Blocking: 0 (139713)
  Mastery: 0 (151976)
  Migration: 0 (0)
Lists: Dirty=Empty  Purge=InUse  PendingASTs=Empty  PendingBASTs=Empty
Purge Count: 8  Refs: 51169
Dead Node: 255
Recovery Pid: 5665  Master: 255  State: INACTIVE
Recovery Map: 
Recovery Node State:

The other commands:
debugfs.ocfs2 –R “fs_locks –B” /dev/drbd0
debugfs.ocfs2 –R “fs_locks –B” /dev/vg/lv
debugfs.ocfs2 –R “dlm_locks M000000000000000022d63c00000000” /dev/drbd0

produced the error:
open: Device name specified was not found while opening context
 for device –R
debugfs.ocfs2 1.4.2
debugfs: 
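
The error names `–R` as the device, which suggests that the dash and quotes in the commands above are typographic characters (possibly picked up when copying from the PDF) rather than the ASCII `-` and `"` that the option parser expects. A minimal sketch of a cleanup step, assuming that is the cause; `normalize_cmd` is a hypothetical helper name, not part of ocfs2-tools:

```shell
#!/bin/sh
# Replace typographic punctuation with the ASCII characters that a shell
# and an option parser expect. normalize_cmd is a hypothetical helper.
normalize_cmd() {
    printf '%s\n' "$1" | sed \
        -e 's/–/-/g' \
        -e 's/“/"/g' \
        -e 's/”/"/g'
}

# Print the first command from above with ASCII punctuation.
normalize_cmd 'debugfs.ocfs2 –R “fs_locks –B” /dev/drbd0'
```

The printed line can then be retyped or pasted at the prompt. Whether the commands then succeed on this volume has not been verified here.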

and:

ps -e -o pid,stat,comm,wchan=WIDE-WCHAN-COLUMN

produced no D-state processes.
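
Since the ps listing prints every process, a small filter can make D-state entries easier to spot. A minimal sketch, assuming a procps-style ps whose second column is STAT; `filter_d_state` is a hypothetical helper name:

```shell
#!/bin/sh
# Keep the header line plus any row whose STAT column begins with "D"
# (uninterruptible sleep). filter_d_state is a hypothetical helper.
filter_d_state() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Same ps invocation as above, piped through the filter.
ps -e -o pid,stat,comm,wchan=WIDE-WCHAN-COLUMN | filter_d_state
```

If only the header line is printed, no process was in D state at that moment.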


I am sorry for writing this to the mailing list, but I am a noob, so I don't even know whether it is a bug, a misconfiguration, or a misunderstanding.

PS. Is the nodiratime option supported for mounts? I used it, but I don't see it in the user guide.


 

-----Original Message-----
From: Sunil Mushran <sunil.mushran at oracle.com>
To: sylarrrrrrr at aim.com
Cc: tao.ma at oracle.com; ocfs2-users at oss.oracle.com
Sent: Tue, Jul 7, 2009 8:46 pm
Subject: Re: [Ocfs2-users] umount hang + high CPU









The fix was for the oops you saw.

The hang is a different issue. We have no info on that.

For that, if you would like to diagnose the problem, read up the dlm notes
in the 1.4 user's guide. It explains a debugging process vis-a-vis hangs.

If the issue is dlm related, then we would like to have the tcpdumps.

Lastly, emails are not an efficient vehicle for handling such issues. Use
the bugzilla as it allows us to collect information in one place.

Sunil
 

sylarrrrrrr at aim.com wrote: 

> So this bug is not over yet :(
>
> I have checked my kernel source and indeed it has this patch, but I
> still get the hang.
>
> PS. my linux-2.6-2.6.30/fs/ocfs2/dcache.c kernel source has:
>
>         else
>                 mlog_errno(ret);
>
>         /*
>          * In case of error, manually free the allocation and do the iput().
>          * We need to do this because error here means no d_instantiate(),
>          * which means iput() will not be called during dput(dentry).
>          */
>         if (ret < 0 && !alias) {
>                 ocfs2_lock_res_free(&dl->dl_lockres);
>                 BUG_ON(dl->dl_count != 1);
>                 spin_lock(&dentry_attach_lock);
>                 dentry->d_fsdata = NULL;
>                 spin_unlock(&dentry_attach_lock);
>                 kfree(dl);
>                 iput(inode);
>         }
>
>         dput(alias);
>
>         return ret;
> }
 



 
