[Ocfs2-users] OCFS2 tuning, fragmentation and localalloc option. Cluster hanging during mixed read+write workloads

Gavin Jones gjones at where2getit.com
Tue Jul 16 07:58:00 PDT 2013


Hello,

Block size: 4kB

Kernel version:  3.4.6-2.10-default

OCFS2:  1.5.0

Distribution is openSUSE 12.2.

Thanks,

Gavin W. Jones
Where 2 Get It, Inc.

On Mon, Jul 15, 2013 at 7:32 PM, Srinivas Eeda <srinivas.eeda at oracle.com> wrote:
> I am not entirely sure about the significant slowdown and cluster outage.
> But from your description and the information you provided, you are seeing
> fragmentation-related issues. What is the ocfs2/kernel version and what
> is the cluster size/block size of these volumes?
>
>
> On 07/15/2013 01:33 PM, Gavin Jones wrote:
>> Hello,
>>
>> We have a 16-node OCFS2 cluster used for web serving duties.  Each
>> node mounts (the same) 6 OCFS2 volumes.  Shared data includes client
>> files, application files for our webapp, log files, and configuration
>> files.  Storage is provided by 2x EqualLogic PS400E iSCSI SANs, each
>> having 12 drives in a RAID50; units are in a 'Group'.
>>
>> The problem we are having is that periodically, maybe once a week or
>> so, we get several Apache processes on a handful of nodes that get
>> stuck in D state and are unable to recover.  This greatly increases
>> server load, causes more Apache processes to back up, OCFS2 starts
>> complaining about unresponsive nodes and before you know it, the
>> cluster is down.
>>
>> This seems to occur most often when we are doing writes + reads; if it
>> is just reads the cluster hums along.  However, when we need to update
>> many files or remove lots of files (think temporary images) in
>> addition to normal read activity, we have the above-mentioned problem.
>>
>> We have done some searching and found
>> http://www.mail-archive.com/ocfs2-users@oss.oracle.com/msg05525.html
>> which describes a similar problem with write activity.  In that case,
>> the problem was allocating contiguous space on a fragmented filesystem
>> and the solution was to adjust the mount option 'localalloc'.  We are
>> wondering if we are in a similar position.
>>
>> Below is the output from the stat_sysdir_analyze.sh script mentioned
>> in the link above, which analyzes stat_sysdir.sh output; I've included
>> the two volumes that seem to be our 'problem' volumes.
>>
>> Volume 1:
>> bash stat_sysdir_analyze.sh sde1-client-20130715.txt
>> Number |
>> of     |
>> clust. | Contiguous cluster size
>> --------------------------------
>>   4549 | 510 and smaller
>>   1825 | 511
>>
>> Volume 2:
>> bash stat_sysdir_analyze.sh sdd1-data-20130715.txt
>> Number |
>> of     |
>> clust. | Contiguous cluster size
>> --------------------------------
>>    175 | 510 and smaller
>>     23 | 511
>>
>> Is there any evidence here of excessive fragmentation that tuning
>> localalloc would help with?
>>
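For anyone wanting to compare volumes on a single number, here is a rough Python sketch that turns the two-row summaries above into a share of entries reporting 510 clusters or fewer. The function names `parse_summary` and `share_below_max` are made up for illustration, and the parser only assumes the row shape shown above ("count, then label"); it says nothing about what threshold counts as "too fragmented".

```python
import re

def parse_summary(text):
    """Parse rows such as '4549 510 and smaller' into (count, label) pairs.

    Header, separator and command lines are skipped because they do not
    start with a number. Leading '>' quote markers are tolerated.
    """
    rows = []
    for line in text.splitlines():
        line = line.lstrip("> ").strip()
        # Optional '|' separator between the count and the label.
        m = re.match(r"^(\d+)\s*\|?\s*(\d+.*)$", line)
        if m:
            rows.append((int(m.group(1)), m.group(2).strip()))
    return rows

def share_below_max(rows, max_label="511"):
    """Fraction of counted entries whose label is not the maximum size."""
    total = sum(count for count, _ in rows)
    below = sum(count for count, label in rows if label != max_label)
    return below / total if total else 0.0

# Figures copied from the two summaries quoted above.
volume1 = """
4549 510 and smaller
1825 511
"""
volume2 = """
175 510 and smaller
23 511
"""

for name, text in (("Volume 1", volume1), ("Volume 2", volume2)):
    rows = parse_summary(text)
    print(f"{name}: {share_below_max(rows):.1%} of entries are 510 clusters or smaller")
```

This only restates the posted numbers as a ratio; whether that ratio explains the hangs is exactly the open question.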
>> Also regarding localalloc, I notice it is different for the above two
>> volumes on many of the nodes; I find this interesting as the cluster
>> is supposed to make an educated guess on this value.  For instance:
>>
>> /dev/sda1 on /u/client type ocfs2
>> (rw,relatime,_netdev,heartbeat=local,nointr,data=ordered,errors=remount-ro,atime_quantum=60,localalloc=6,coherency=full,user_xattr,noacl)
>> /dev/sde1 on /u/data type ocfs2
>> (rw,relatime,_netdev,heartbeat=local,nointr,data=ordered,errors=remount-ro,atime_quantum=60,localalloc=5,coherency=full,user_xattr,noacl)
>>
>>
>> /dev/sdd1 on /u/client type ocfs2
>> (rw,relatime,_netdev,heartbeat=local,nointr,data=ordered,errors=remount-ro,atime_quantum=60,localalloc=9,coherency=full,user_xattr,noacl)
>> /dev/sdb1 on /u/data type ocfs2
>> (rw,relatime,_netdev,heartbeat=local,nointr,data=ordered,errors=remount-ro,atime_quantum=60,localalloc=5,coherency=full,user_xattr,noacl)
>>
>>
>> /dev/sda1 on /u/client type ocfs2
>> (rw,relatime,_netdev,heartbeat=local,nointr,data=ordered,errors=remount-ro,atime_quantum=60,localalloc=11,coherency=full,user_xattr,noacl)
>> /dev/sdc1 on /u/data type ocfs2
>> (rw,relatime,_netdev,heartbeat=local,nointr,data=ordered,errors=remount-ro,atime_quantum=60,localalloc=5,coherency=full,user_xattr,noacl)
>>
>>
>> /dev/sda1 on /u/client type ocfs2
>> (rw,relatime,_netdev,heartbeat=local,nointr,data=ordered,errors=remount-ro,atime_quantum=60,localalloc=6,coherency=full,user_xattr,noacl)
>> /dev/sdc1 on /u/data type ocfs2
>> (rw,relatime,_netdev,heartbeat=local,nointr,data=ordered,errors=remount-ro,atime_quantum=60,localalloc=7,coherency=full,user_xattr,noacl)
>>
>> I'm not sure why the cluster would pick different values depending
>> on the node.
>>
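To make the per-node differences easier to spot, the `mount` output from each node could be collected and compared with something like the Python sketch below. The node names (`node-a` to `node-d`) are placeholders, the option lists are abbreviated from the output above, and `localalloc_by_mount` and `differing_values` are hypothetical helper names. It assumes each mount entry sits on one line, as `mount` normally prints it.

```python
import re

# Matches lines such as:
#   /dev/sda1 on /u/client type ocfs2 (rw,relatime,localalloc=6,...)
MOUNT_RE = re.compile(r"^(\S+) on (\S+) type ocfs2\s+\(([^)]*)\)?")

def localalloc_by_mount(mount_output):
    """Return {mount point: localalloc value} for OCFS2 entries."""
    result = {}
    for line in mount_output.splitlines():
        m = MOUNT_RE.match(line.strip())
        if not m:
            continue
        _device, mount_point, options = m.groups()
        for option in options.split(","):
            key, _, value = option.partition("=")
            if key == "localalloc" and value.isdigit():
                result[mount_point] = int(value)
    return result

def differing_values(per_node):
    """Return mount points whose localalloc differs between nodes."""
    seen = {}
    for mounts in per_node.values():
        for mount_point, value in mounts.items():
            seen.setdefault(mount_point, set()).add(value)
    return {mp: sorted(vals) for mp, vals in seen.items() if len(vals) > 1}

# Abbreviated copies of the four pairs of mount lines quoted above;
# the node names are placeholders, not real host names.
samples = {
    "node-a": "/dev/sda1 on /u/client type ocfs2 (rw,localalloc=6,noacl)\n"
              "/dev/sde1 on /u/data type ocfs2 (rw,localalloc=5,noacl)",
    "node-b": "/dev/sdd1 on /u/client type ocfs2 (rw,localalloc=9,noacl)\n"
              "/dev/sdb1 on /u/data type ocfs2 (rw,localalloc=5,noacl)",
    "node-c": "/dev/sda1 on /u/client type ocfs2 (rw,localalloc=11,noacl)\n"
              "/dev/sdc1 on /u/data type ocfs2 (rw,localalloc=5,noacl)",
    "node-d": "/dev/sda1 on /u/client type ocfs2 (rw,localalloc=6,noacl)\n"
              "/dev/sdc1 on /u/data type ocfs2 (rw,localalloc=7,noacl)",
}

per_node = {node: localalloc_by_mount(text) for node, text in samples.items()}
print(differing_values(per_node))
```

Keying on the mount point rather than the device name is deliberate, since the same volume shows up under different `/dev/sdX` names on different nodes.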
>> Anyway, any opinions, advice, tuning suggestions greatly appreciated.
>> This business of the cluster hanging is turning into quite a problem.
>>
>> I'll provide any other information upon request.
>>
>> Thanks,
>>
>> Gavin W. Jones
>> Where 2 Get It, Inc.
>>
>> --
>> "There has grown up in the minds of certain groups in this country the
>> notion that because a man or corporation has made a profit out of the
>> public for a number of years, the government and the courts are
>> charged with the duty of guaranteeing such profit in the future, even
>> in the face of changing circumstances and contrary to public interest.
>> This strange doctrine is not supported by statute nor common law."
>>
>> ~Robert Heinlein
>>
>> _______________________________________________
>> Ocfs2-users mailing list
>> Ocfs2-users at oss.oracle.com
>> https://oss.oracle.com/mailman/listinfo/ocfs2-users



