[Ocfs2-users] Slow umounts on SLES10 patchlevel 3 ocfs2

Sunil Mushran sunil.mushran at oracle.com
Thu Jul 14 13:30:24 PDT 2011


Well, half a million locks on its own does not account for that much time. But
add heavily loaded servers, a slower interconnect, and a high percentage of
shared resources, and the numbers could add up.
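
A rough back-of-the-envelope check (illustrative numbers only): the larger
volume on your first cluster has ~470,000 resources known to the dlm. If
migrating each one costs even ~5 ms of messaging on a busy interconnect,
that alone is ~2,350 seconds, or close to 40 minutes. So per-resource
latencies in the low milliseconds are enough to explain a 45-minute umount.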

I mean, this is a fairly old release. We have made improvements since then.
Having said that, the biggest improvement, parallel migration, is still on our
todo list.
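
In the meantime, one thing you could experiment with: the modinfo output you
posted shows this ocfs2_dlm build takes two purge parameters,
dlm_purge_interval_ms and dlm_purge_locks_max. Assuming the names mean what
they appear to (the purge scan interval and the per-scan purge cap), purging
unused lock resources more aggressively during normal operation should leave
fewer resources to migrate at umount time. The values below are illustrative
guesses, not tested recommendations:

# Untested sketch: purge more often, and allow more purges per pass.
# On SLES10 this line would go in /etc/modprobe.conf.local and takes
# effect the next time the ocfs2_dlm module is loaded.
options ocfs2_dlm dlm_purge_interval_ms=5000 dlm_purge_locks_max=1000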

Other than that, I guess, for now, this is it.
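
If you want to confirm that resource migration is where the umount time goes,
something like this (untested) should show the dlm resource count draining
while the umount runs. It reads the same locking_state file as the count
script further down; the domain UUID here is one from your listing:

# dom is the domain UUID of the filesystem being unmounted.
dom=1EFA64C36FD54AB48B734A99E7F45A73
while [ -f /sys/kernel/debug/o2dlm/${dom}/locking_state ]; do
    grep -c "^NAME:" /sys/kernel/debug/o2dlm/${dom}/locking_state
    sleep 10
done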

Sunil

On 07/14/2011 05:33 AM, Marc Grimme wrote:
> So I now have figures from two different clusters. Both are quite slow during restarts, and both have two filesystems mounted.
>
> Cluster1 (the one that took very long last time):
> Cluster locks held by filesystem:
> 1788AD39151A4E76997420D62A778E65: 274258 locks
> 1EFA64C36FD54AB48B734A99E7F45A73: 576842 locks
> Cluster resources held by filesystem:
> 1788AD39151A4E76997420D62A778E65: 214545 resources
> 1EFA64C36FD54AB48B734A99E7F45A73: 469319 resources
>
> Second cluster (also takes quite long):
> Cluster locks held by filesystem:
> 1EDBCFF0CAB24D0CAE91CB2DA241E8CA: 717186 locks
> 585462C2FA5A428D913A3CBDBC77E116: 68 locks
> Cluster resources held by filesystem:
> 1EDBCFF0CAB24D0CAE91CB2DA241E8CA: 587471 resources
> 585462C2FA5A428D913A3CBDBC77E116: 20 resources
>
>
> Let me know if you need more information.
>
> Thanks
> Marc.
> ----- "Sunil Mushran"<sunil.mushran at oracle.com>  wrote:
>
>> It was designed to run in production environments.
>>
>> On 07/07/2011 12:21 AM, Marc Grimme wrote:
>>> Sunil,
>>> can I query those figures at runtime on a production cluster?
>>> Or might it influence availability or performance in any way?
>>>
>>> Thanks for your help.
>>> Marc.
>>> ----- "Sunil Mushran"<sunil.mushran at oracle.com>   wrote:
>>>
>>>> umount is a two-step process. First the fs frees the inodes. Then the
>>>> o2dlm takes stock of all active resources and migrates the ones that
>>>> are still in use. This typically takes some time. But I have never
>>>> heard of it taking 45 minutes.
>>>>
>>>> But I guess it could if one has a lot of resources. Let's start by
>>>> getting a count.
>>>>
>>>> This will dump the number of cluster locks held by the fs.
>>>> # for vol in /sys/kernel/debug/ocfs2/*
>>>>       do
>>>>           count=$(wc -l < ${vol}/locking_state)
>>>>           echo "$(basename ${vol}): ${count} locks"
>>>>       done
>>>>
>>>> This will dump the number of lock resources known to the dlm.
>>>> # for vol in /sys/kernel/debug/o2dlm/*
>>>>       do
>>>>           count=$(grep -c "^NAME:" ${vol}/locking_state)
>>>>           echo "$(basename ${vol}): ${count} resources"
>>>>       done
>>>>
>>>> The debugfs filesystem needs to be mounted for this to work:
>>>> # mount -t debugfs none /sys/kernel/debug
>>>>
>>>> Sunil
>>>>
>>>> On 07/06/2011 08:20 AM, Marc Grimme wrote:
>>>>> Hi,
>>>>> we are using SLES10 Patchlevel 3 with 12 nodes hosting tomcat
>>>>> application servers.
>>>>> The cluster had been running for some time (about 200 days) without
>>>>> problems.
>>>>> Recently we needed to shut down the cluster for maintenance and
>>>>> experienced very long times for the umount of the filesystems. It took
>>>>> something like 45 minutes per node and filesystem (12 x 45 minutes
>>>>> shutdown time).
>>>>> As a result the planned downtime had to be extended ;-).
>>>>>
>>>>> Is there any tuning option or the like to make those umounts faster,
>>>>> or is this something we have to live with?
>>>>> Thanks for your help.
>>>>> If you need more information, let me know.
>>>>>
>>>>> Marc.
>>>>>
>>>>> Some info on the configuration:
>>>>> ---------------------------X8-----------------------------------
>>>>> # /sbin/modinfo ocfs2
>>>>> filename:       /lib/modules/2.6.16.60-0.54.5-smp/kernel/fs/ocfs2/ocfs2.ko
>>>>> license:        GPL
>>>>> author:         Oracle
>>>>> version:        1.4.1-1-SLES
>>>>> description:    OCFS2 1.4.1-1-SLES Wed Jul 23 18:33:42 UTC 2008 (build f922955d99ef972235bd0c1fc236c5ddbb368611)
>>>>> srcversion:     986DD1EE4F5ABD8A44FF925
>>>>> depends:        ocfs2_dlm,jbd,ocfs2_nodemanager
>>>>> supported:      yes
>>>>> vermagic:       2.6.16.60-0.54.5-smp SMP gcc-4.1
>>>>> atix@CAS12:~> /sbin/modinfo ocfs2_dlm
>>>>> filename:       /lib/modules/2.6.16.60-0.54.5-smp/kernel/fs/ocfs2/dlm/ocfs2_dlm.ko
>>>>> license:        GPL
>>>>> author:         Oracle
>>>>> version:        1.4.1-1-SLES
>>>>> description:    OCFS2 DLM 1.4.1-1-SLES Wed Jul 23 18:33:42 UTC 2008 (build f922955d99ef972235bd0c1fc236c5ddbb368611)
>>>>> srcversion:     16FE87920EA41CA613E6609
>>>>> depends:        ocfs2_nodemanager
>>>>> supported:      yes
>>>>> vermagic:       2.6.16.60-0.54.5-smp SMP gcc-4.1
>>>>> parm:           dlm_purge_interval_ms:int
>>>>> parm:           dlm_purge_locks_max:int
>>>>> # rpm -qa ocfs2*
>>>>> ocfs2-tools-1.4.0-0.9.9
>>>>> ocfs2console-1.4.0-0.9.9
>>>>> ---------------------------X8-----------------------------------
>>>>> The kernel version is 2.6.16.60-0.54.5-smp
>>>>>
>>>>>
>>>>> ______________________________________________________________________________
>>>>> Marc Grimme
>>>>>
>>>>> E-Mail: grimme at atix.de



