[Ocfs2-users] OCFS2 tuning, fragmentation and localalloc option. Cluster hanging during mix read+write workloads

Srinivas Eeda srinivas.eeda at oracle.com
Mon Jul 15 17:32:36 PDT 2013


I am not entirely sure about the significant slowdown and the cluster 
outage, but from your description and the information you provided, it 
looks like you are hitting fragmentation-related issues. What is the 
ocfs2/kernel version, and what are the cluster size and block size of 
these volumes?
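The block and cluster sizes can usually be read straight from the superblock with debugfs.ocfs2 (a read-only sketch; the exact output labels can vary by ocfs2-tools version, and /dev/sde1 is just an example device):

```shell
# Query the superblock of one of the problem volumes (read-only).
# "stats" prints, among other things, the block and cluster size bits;
# the actual size is 2^bits, e.g. "Block Size Bits: 12" means 4 KB blocks.
debugfs.ocfs2 -R "stats" /dev/sde1 | grep -i "bits"
```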


On 07/15/2013 01:33 PM, Gavin Jones wrote:
> Hello,
>
> We have a 16-node OCFS2 cluster used for web serving duties.  Each
> node mounts the same 6 OCFS2 volumes.  Shared data includes client
> files, application files for our webapp, log files, and configuration
> files.  Storage is provided by two EqualLogic PS400E iSCSI SANs, each
> with 12 drives in a RAID 50; the units are in a 'Group'.
>
> The problem we are having is that periodically, maybe once a week or
> so, several Apache processes on a handful of nodes get stuck in D
> state and are unable to recover.  This greatly increases server load
> and causes more Apache processes to back up; OCFS2 starts complaining
> about unresponsive nodes, and before you know it, the cluster is down.
>
> This seems to occur most often when we are doing writes + reads; if it
> is just reads the cluster hums along.  However, when we need to update
> many files or remove lots of files (think temporary images) in
> addition to normal read activity, we have the above-mentioned problem.
>
> We have done some searching and found
> http://www.mail-archive.com/ocfs2-users@oss.oracle.com/msg05525.html
> which describes a similar problem with write activity.  In that case,
> the problem was allocating contiguous space on a fragmented filesystem
> and the solution was to adjust the mount option 'localalloc'.  We are
> wondering if we are in a similar position.
>
> Below is the output from the stat_sysdir_analyze.sh script mentioned
> in the link above, which analyzes stat_sysdir.sh output; I've included
> the two volumes that seem to be our 'problem' volumes.
>
> Volume 1:
> bash stat_sysdir_analyze.sh sde1-client-20130715.txt
> Number |
> of     |
> clust. | Contiguous cluster size
> --------------------------------
>   4549 | 510 and smaller
>   1825 | 511
>
> Volume 2:
> bash stat_sysdir_analyze.sh sdd1-data-20130715.txt
> Number |
> of     |
> clust. | Contiguous cluster size
> --------------------------------
>    175 | 510 and smaller
>     23 | 511
>
> Any evidence here of excessive fragmentation that tuning localalloc
> would help with?
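As a rough illustration of what the numbers above are counting, here is a hedged sketch (not the original analyze script) that bins a list of contiguous-extent lengths, in clusters, into the same two buckets; the input lengths are made up for the example:

```shell
# Bin contiguous-extent lengths (one per line, in clusters) into the two
# buckets used in the report above: "510 and smaller" vs "511" (the
# largest extent the report distinguishes). Input values are hypothetical.
printf '%s\n' 12 511 480 511 3 511 | awk '
  $1 <= 510 { small++ }
  $1 >= 511 { big++ }
  END { printf "%d 510 and smaller\n%d 511\n", small, big }'
```

A volume where most extents land in the "510 and smaller" bucket, as on Volume 1, is the pattern the linked thread associates with fragmentation.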
>
> Also regarding localalloc, I notice it differs for the same volumes
> across many of the nodes; I find this interesting, as the filesystem
> is supposed to make an educated guess at this value.  For instance:
>
> /dev/sda1 on /u/client type ocfs2
> (rw,relatime,_netdev,heartbeat=local,nointr,data=ordered,errors=remount-ro,atime_quantum=60,localalloc=6,coherency=full,user_xattr,noacl)
> /dev/sde1 on /u/data type ocfs2
> (rw,relatime,_netdev,heartbeat=local,nointr,data=ordered,errors=remount-ro,atime_quantum=60,localalloc=5,coherency=full,user_xattr,noacl)
>
>
> /dev/sdd1 on /u/client type ocfs2
> (rw,relatime,_netdev,heartbeat=local,nointr,data=ordered,errors=remount-ro,atime_quantum=60,localalloc=9,coherency=full,user_xattr,noacl)
> /dev/sdb1 on /u/data type ocfs2
> (rw,relatime,_netdev,heartbeat=local,nointr,data=ordered,errors=remount-ro,atime_quantum=60,localalloc=5,coherency=full,user_xattr,noacl)
>
>
> /dev/sda1 on /u/client type ocfs2
> (rw,relatime,_netdev,heartbeat=local,nointr,data=ordered,errors=remount-ro,atime_quantum=60,localalloc=11,coherency=full,user_xattr,noacl)
> /dev/sdc1 on /u/data type ocfs2
> (rw,relatime,_netdev,heartbeat=local,nointr,data=ordered,errors=remount-ro,atime_quantum=60,localalloc=5,coherency=full,user_xattr,noacl)
>
>
> /dev/sda1 on /u/client type ocfs2
> (rw,relatime,_netdev,heartbeat=local,nointr,data=ordered,errors=remount-ro,atime_quantum=60,localalloc=6,coherency=full,user_xattr,noacl)
> /dev/sdc1 on /u/data type ocfs2
> (rw,relatime,_netdev,heartbeat=local,nointr,data=ordered,errors=remount-ro,atime_quantum=60,localalloc=7,coherency=full,user_xattr,noacl)
>
> I'm not sure why the cluster would pick different values depending on
> the node.
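For what it's worth, localalloc=N sets the size, in megabytes, of each node's local allocation window, and at mount time the kernel will size the window down based on the contiguous free space it can actually find, which would explain different values across nodes and mounts on a fragmented volume. The window size in clusters is simple arithmetic (the 4 KB cluster size below is an assumption; check your volumes):

```shell
# Local alloc window in clusters = (localalloc MB * 2^20) / cluster size.
# Assuming 4 KB clusters, localalloc=8 gives a 2048-cluster window.
echo $(( 8 * 1024 * 1024 / 4096 ))
```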
>
> Anyway, any opinions, advice, tuning suggestions greatly appreciated.
> This business of the cluster hanging is turning into quite a problem.
>
> I'll gladly provide any other information on request.
>
> Thanks,
>
> Gavin W. Jones
> Where 2 Get It, Inc.
>
> --
> "There has grown up in the minds of certain groups in this country the
> notion that because a man or corporation has made a profit out of the
> public for a number of years, the government and the courts are
> charged with the duty of guaranteeing such profit in the future, even
> in the face of changing circumstances and contrary to public interest.
> This strange doctrine is not supported by statute nor common law."
>
> ~Robert Heinlein
>
> _______________________________________________
> Ocfs2-users mailing list
> Ocfs2-users at oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/ocfs2-users



