[Ocfs2-users] How to clean orphan metadata?

Gonçalo Borges goncalo at lip.pt
Tue Jul 28 10:23:12 PDT 2009


Hi All...

Thanks for the replies, but in the end I didn't have any option other than to 
shut down the Xen machines running on the partition and unmount/mount it 
before the partition reported the correct size again.
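
For reference, the recovery steps were roughly the following (a sketch only; 
the domain name is a placeholder, and it assumes the volume is mounted at 
/site06, as described further down in the thread):

# xm list                            # identify the Xen guests with images on /site06
# xm shutdown <domU-using-site06>    # stop those guests
# umount /site06                     # release the OCFS2 mount on this node
# mount /site06                      # remount; df now reports the correct usage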

Cheers
Goncalo

On 07/27/2009 05:16 PM, Gonçalo Borges wrote:
> Hi Karim...
>
>
> Running the commands (on ALL clients) to identify the application/node 
> associated with the orphan_dir does not produce any output.
>
> [root@fw01 ~]# for i in 07 08 09 10 11 12 21 22 23 24 25 26; do echo 
> "### core$i ###"; ssh core$i "find /proc -name fd -exec ls -l {} \; | 
> grep deleted; lsof | grep -i deleted"; done
> ### core07 ###
> ### core08 ###
> ### core09 ###
> ### core10 ###
> ### core11 ###
> ### core12 ###
> ### core21 ###
> ### core22 ###
> ### core23 ###
> ### core24 ###
> ### core25 ###
> ### core26 ###
>
> I've also tried "mount -o remount /site06" and several syncs on all 
> clients, but without success.
>
> The orphan file is still there... :(
>
> Cheers
> Goncalo
>
>
> On 07/27/2009 04:33 PM, Karim Alkhayer wrote:
>>
>> Hi Goncalo,
>>
>> Here are some guidelines to rectify your issue:
>>
>> *Identify the cluster node and application associated with orphan_dir*
>>
>> Run the following command(s) on each cluster node to identify which 
>> node, application, or user (the holders) is associated with the 
>> orphan_dir entries:
>>
>> # find /proc -name fd -exec ls -l {} \; | grep deleted
>> or
>> # lsof | grep -i deleted
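>>
>> A narrower check, limited to the affected volume (assuming it is mounted 
>> at /site06 on each node, as shown in your df output below), would be:
>>
>> # lsof /site06 | grep -i deleted
>> or
>> # fuser -vm /site06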
>>
>>
>> Next, review the output of the above command(s), noting any entries that 
>> relate to the OCFS2 filesystem in question. At this point you should be 
>> able to determine the holding process id (pid).
>>
>> *Releasing disk space associated with OCFS2 orphan directories*
>>
>> The step above allows you to identify the pid(s) associated with the 
>> orphaned files. If the holding process(es) can still be gracefully 
>> interacted with via their user interface, and you are certain they are 
>> safe to stop without adverse effect on your environment, then shut down 
>> the process(es) in question. Once the process(es) close their open file 
>> descriptors, the orphaned files will be deleted and the associated disk 
>> space made available.
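>>
>> To double-check which deleted files a given pid still holds open before 
>> stopping it (a sketch; <pid> is a placeholder for the pid found above):
>>
>> # ls -l /proc/<pid>/fd | grep deleted      # lists descriptors pointing at unlinked files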
>>
>> If the process(es) in question cannot be interacted with via their user 
>> interface, or if you are certain they are no longer required, then kill 
>> the associated process(es), e.g. `kill <pid>`. If any process(es) are no 
>> longer responsive (i.e. zombies) or cannot be successfully killed, a 
>> forced unmount of the OCFS2 volume in question and/or a reboot of the 
>> associated cluster node may be necessary in order to recover the disk 
>> space associated with the orphaned files.
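>>
>> If it comes to that, the forced cleanup on the affected node might look 
>> roughly like this (a sketch, assuming the mount point is /site06 and you 
>> have already confirmed nothing critical is still using it):
>>
>> # fuser -vm /site06       # show remaining users of the mount
>> # fuser -km /site06       # kill every process still accessing it (destructive!)
>> # umount /site06          # unmount and mount again
>> # mount /site06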
>>
>> Let us know how it goes!
>>
>> Best regards,
>>
>> Karim Alkhayer
>>
>> From: ocfs2-users-bounces at oss.oracle.com 
>> [mailto:ocfs2-users-bounces at oss.oracle.com] On Behalf Of Gonçalo Borges
>> Sent: Monday, July 27, 2009 4:35 PM
>> To: ocfs2-users at oss.oracle.com
>> Subject: [Ocfs2-users] How to clean orphan metadata?
>> *Subject:* [Ocfs2-users] How to clean orphan metadata?
>>
>> Hi All...
>>
>> 1) I have recently deleted a big 100 GB file from an OCFS2 partition. 
>> The problem is that a "df" command still shows the partition with 
>> 142 GB of used space when it should report ~42 GB of used space (look 
>> at /site06):
>>
>> [root@core23 ~]# df -h
>> Filesystem            Size  Used Avail Use% Mounted on
>> /dev/sda1              87G  2.4G   80G   3% /
>> tmpfs                 512M     0  512M   0% /dev/shm
>> none                  512M  104K  512M   1% /var/lib/xenstored
>> /dev/mapper/iscsi04-lun1p1
>>                       851G   63G  788G   8% /site04
>> /dev/mapper/iscsi05-lun1p1
>>                       851G   65G  787G   8% /site05
>> /dev/mapper/iscsi06-lun2p1
>>                       884G  100G  785G  12% /apoio06
>> /dev/mapper/iscsi06-lun1p1
>>                       851G  142G  709G  17% /site06
>>
>>
>> 2) Running "debugfs.ocfs2 /dev/mapper/iscsi06-lun1p1", I found the 
>> following relevant file:
>>
>> debugfs: ls -l //orphan_dir:0001
>>     13        drwxr-xr-x   2     0     0            3896 27-Jul-2009 09:55 .
>>     6         drwxr-xr-x  18     0     0            4096  9-Jul-2009 12:24 ..
>>     524781    -rw-r--r--   0     0     0    104857600000 24-Jul-2009 16:35 00000000000801ed
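>>
>> To inspect the orphan directories of the other node slots as well, 
>> something like the following should work (a sketch; it assumes the usual 
>> orphan_dir:<4-digit slot number> naming, one directory per node slot, 
>> and 12 slots on this volume):
>>
>> for s in $(seq 0 11); do debugfs.ocfs2 -R \
>>   "ls -l //orphan_dir:$(printf %04d $s)" /dev/mapper/iscsi06-lun1p1; done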
>>
>>
>> 3) I need to clean up this metadata, but I cannot run "fsck.ocfs2 -f" 
>> because this is a production filesystem being accessed by 12 clients. 
>> To run "fsck.ocfs2 -f" I would have to unmount the partition on all the 
>> clients, and that is not an option at the moment. The software I'm 
>> currently using is:
>>
>> [root@core09 log]# cat /etc/redhat-release
>> Scientific Linux SL release 5.3 (Boron)
>>
>> [root@core09 log]# uname -a
>> Linux core09.ncg.ingrid.pt 2.6.18-128.1.16.el5xen #1 SMP Tue Jun 30 
>> 07:06:24 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
>>
>> [root@core09 log]# rpm -qa | grep ocfs2
>> ocfs2-2.6.18-128.1.16.el5xen-1.4.2-1.el5
>> ocfs2-tools-1.4.2-1.el5
>> ocfs2console-1.4.2-1.el5
>>
>>
>> Is there a workaround for this?
>> Cheers
>> Goncalo
>>
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Ocfs2-users mailing list
> Ocfs2-users at oss.oracle.com
> http://oss.oracle.com/mailman/listinfo/ocfs2-users
