[Ocfs2-users] How to clean orphan metadata?

Mon Jul 27 09:16:16 PDT 2009

Hi Karim...

Running the commands (in ALL clients) to identify the application/node 
associated with the orphan_dir does not provide me any output.

root at fw01 ~]# for i in 07 08 09 10 11 12 21 22 23 24 25 26; do echo "### 
core$i ###"; ssh core$i "find /proc -name fd -exec ls -l {} \; | grep 
deleted; lsof | grep -i deleted"; done
### core07 ###
### core08 ###
### core09 ###
### core10 ###
### core11 ###
### core12 ###
### core21 ###
### core22 ###
### core23 ###
### core24 ###
### core25 ###
### core26 ###

I've also tried "mount -o remount /site06", and several syncs, in all 
clients, but without success.

The orphan file continues there... :(

Cheers
Goncalo

On 07/27/2009 04:33 PM, Karim Alkhayer wrote:
>
> Hi Goncalo,
>
> Here're some guidelines to rectify your issue:
>
> *_Identify cluster node and application associated with orphan_dir_*
>
> Run the following command(s) on each cluster node to identify which 
> node, application or user (holders) are associated with orphan_dir 
> entries.
>
> |# find /proc -name fd -exec ls -l {} \; | grep deleted|
> | or|
> |# lsof | grep -i deleted|
>
>
> Next, review the output of the above command(s) noting any that relate 
> to the OCFS2 filesystem in question.
> At this point, you should be able to determine the holding process id 
> (pid)
>
> *_Releasing disk space associated with OCFS2 orphan directories_*
>
> The above step allows you to identify the pid associated with orphaned 
> files.
> If the holding process(es) can still be gracefully interacted with via 
> their user interface, and you are certain that the process is safe to 
> stop without adverse effect upon your environment, then shutdown the 
> process(es) in question. Once the process(es) close their open file 
> descriptors, orphaned files will be deleted and the associated disk 
> space made available.
>
> If the process(es) in question cannot be interacted with via their 
> user interface, or if you are certain the processes are no longer 
> required, then kill the associated process(es) i.e. `kill <pid>`. If 
> any process(es) are no longer communicable (i.e. zombie) or cannot be 
> successfully killed, a forced unmount of the OCFS2 volume in question 
> and/or reboot of the associated cluster node may be necessary in order 
> to recover the disk space associated with orphaned files.
>
> Let us know how it goes!
>
> Best regards,
>
> Karim Alkhayer
>
> *From:* ocfs2-users-bounces at oss.oracle.com 
> [mailto:ocfs2-users-bounces at oss.oracle.com] *On Behalf Of *Gonçalo Borges
> *Sent:* Monday, July 27, 2009 4:35 PM
> *To:* ocfs2-users at oss.oracle.com
> *Subject:* [Ocfs2-users] How to clean orphan metadata?
>
> Hi All...
>
> 1) I have recently deleted a big 100GB file from an OCFS2 partition. 
> The problem is that a "df" command still shows that partition with 142 
> GB of used spaced when it should report ~42Gb of used space (look to 
> */site06)*:
>
> [root at core23 ~]# df -h
> Filesystem            Size  Used Avail Use% Mounted on
> /dev/sda1              87G  2.4G   80G   3% /
> tmpfs                 512M     0  512M   0% /dev/shm
> none                  512M  104K  512M   1% /var/lib/xenstored
> /dev/mapper/iscsi04-lun1p1
>                       851G   63G  788G   8% /site04
> /dev/mapper/iscsi05-lun1p1
>                       851G   65G  787G   8% /site05
> /dev/mapper/iscsi06-lun2p1
>                       884G  100G  785G  12% /apoio06
> /dev/mapper/iscsi06-lun1p1
> *851G  142G  709G  17% /site06
>
>
> *2) Running "debugfs.ocfs2 /dev/mapper/iscsi06-lun1p1", I found the 
> following relevant file:
>
> debugfs: ls -l //orphan_dir:0001
>     13              drwxr-xr-x   2     0     0            3896 
> 27-Jul-2009 09:55 .
>     6               drwxr-xr-x  18     0     0            4096  
> 9-Jul-2009 12:24 ..
>     524781          -rw-r--r--   0     0     0    104857600000 
> 24-Jul-2009 16:35 00000000000801ed
>
>
> 3) I need to clean this metadata information, but I can not run 
> "fsck.ocfs2 -f" because this is a production filesystem being accessed 
> by 12 clients. To run "fsck.ocfs2 -f" I would have to unmount the 
> partition from all the clients, and this is not a solution at the 
> time. The software I'm currently using is:
>
> [root at core09 log]# cat /etc/redhat-release
> Scientific Linux SL release 5.3 (Boron)
>
> [root at core09 log]# uname -a
> Linux core09.ncg.ingrid.pt 2.6.18-128.1.16.el5xen #1 SMP Tue Jun 30 
> 07:06:24 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
>
> [root at core09 log]# rpm -qa | grep ocfs2
> ocfs2-2.6.18-128.1.16.el5xen-1.4.2-1.el5
> ocfs2-tools-1.4.2-1.el5
> ocfs2console-1.4.2-1.el5
>
>
> Is there a workaround for this?
> Cheers
> Goncalo
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://oss.oracle.com/pipermail/ocfs2-users/attachments/20090727/07f9c7af/attachment-0001.html