[Ocfs2-users] ocfs2 file system just became very slow and unresponsive for writes

Tariq Saeed tariq.x.saeed at oracle.com
Sat Sep 19 18:03:12 PDT 2015


Hi,
The first suspect is a fragmented fs. Please run
the attached script and send the output.
Thanks.
-Tariq
On 09/19/2015 03:43 PM, Alan Hodgson wrote:
> I've had this filesystem in production for 8 months or so. It's on an array of
> Intel S3500 SSDs on an LSI hardware raid controller (without trim).
>
> This filesystem has pretty consistently delivered >500MB/sec writes, up to 300
> from any particular guest, and has otherwise been responsive.
>
> Then, within the last couple of days, it is now writing at like 25-50 MB/sec
> on average, and seems to block reads for long enough to cause guest issues.
>
> It is a 2-node cluster, the file system is on top of a DRBD active/active
> cluster. The node interconnection is a dedicated 10 Gbit link.
>
> The SSD array doesn't seem to be the issue. I have local file systems on the
> same array, and they write at close to 1GB/sec. Not quite as fast as new, but
> still decent.
>
> DRBD still seems to be fast. Resync appears to be happening at over 400
> MB/sec, although not tested extensively as I don't want to resync the whole
> partition. And the issue remains regardless of whether the second node is even
> up.
>
> Writes to ocfs2 with either one or both nodes mounted ... 25-50 MB/sec. And
> super slow/blocked reads within the guests while it's doing them. The cluster
> is really quite screwed as a result. A straight dd to a file on the host
> averages 25MB/sec. Reads are fine, though, well over 1GB/sec.
>
> The file system is a little less than half full. It hosts only KVM guest images
> (raw sparse files).
>
> I have added maybe 300GB of data in the last 24 hours, but I do believe this
> started happening before that.
>
> Random details below, happy to supply anything ... thanks in advance for any
> help.
>
> df:
> /dev/drbd0       4216522032 1887421612 2329100420  45% /vmhost
>
> mount:
> configfs on /sys/kernel/config type configfs (rw,relatime)
> none on /sys/kernel/dlm type ocfs2_dlmfs (rw,relatime)
> /dev/drbd0 on /vmhost type ocfs2 (rw,relatime,_netdev,heartbeat=local,nointr,data=ordered,errors=remount-ro,atime_quantum=60,localalloc=53,coherency=full,user_xattr,acl,_netdev)
>
> Kernel 3.18.9, hardened Gentoo.
>
> debugfs.ocfs2 -R "stats" /dev/drbd0:
>
>          Revision: 0.90
>          Mount Count: 0   Max Mount Count: 20
>          State: 0   Errors: 0
>          Check Interval: 0   Last Check: Sat Sep 19 14:02:48 2015
>          Creator OS: 0
>          Feature Compat: 3 backup-super strict-journal-super
>          Feature Incompat: 14160 sparse extended-slotmap inline-data xattr indexed-dirs refcount discontig-bg
>          Tunefs Incomplete: 0
>          Feature RO compat: 1 unwritten
>          Root Blknum: 5   System Dir Blknum: 6
>          First Cluster Group Blknum: 3
>          Block Size Bits: 12   Cluster Size Bits: 12
>          Max Node Slots: 8
>          Extended Attributes Inline Size: 256
>          Label: vmh1cluster
>          UUID: CF2BAA51E994478587983E08B160930E
>          Hash: 436666593 (0x1a0700e1)
>          DX Seeds: 3101242030 1341766635 3133423927 (0xb8d932ae 0x4ff9bbeb 0xbac44137)
>          Cluster stack: classic o2cb
>          Cluster flags: 0
>          Inode: 2   Mode: 00   Generation: 3336532616 (0xc6df7288)
>          FS Generation: 3336532616 (0xc6df7288)
>          CRC32: 00000000   ECC: 0000
>          Type: Unknown   Attr: 0x0   Flags: Valid System Superblock
>          Dynamic Features: (0x0)
>          User: 0 (root)   Group: 0 (root)   Size: 0
>          Links: 0   Clusters: 1054130508
>          ctime: 0x54b593da 0x0 -- Tue Jan 13 13:53:30.0 2015
>          atime: 0x0 0x0 -- Wed Dec 31 16:00:00.0 1969
>          mtime: 0x54b593da 0x0 -- Tue Jan 13 13:53:30.0 2015
>          dtime: 0x0 -- Wed Dec 31 16:00:00 1969
>          Refcount Block: 0
>          Last Extblk: 0   Orphan Slot: 0
>          Sub Alloc Slot: Global   Sub Alloc Bit: 6553
>
>   o2info --volinfo /dev/drbd0 :
>         Label: vmh1cluster
>          UUID: CF2BAA51E994478587983E08B160930E
>    Block Size: 4096
> Cluster Size: 4096
>    Node Slots: 8
>      Features: backup-super strict-journal-super sparse extended-slotmap
>      Features: inline-data xattr indexed-dirs refcount discontig-bg unwritten
>

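As a quick sanity check on the numbers quoted above, the volume size reported by df can be compared with the superblock fields from debugfs.ocfs2 (Clusters and Cluster Size Bits). The following is only a minimal sketch: the constants are copied by hand from the quoted output, and the function names are made up for illustration, not part of any ocfs2 tool. It says nothing about fragmentation, only that the two tools agree on the size of the volume and on how full it is.

```python
# Values copied by hand from the quoted output (assumption: df reported
# 1K blocks, which is its usual default on Linux).
DF_TOTAL_KIB = 4216522032   # df: total 1K-blocks on /dev/drbd0
DF_USED_KIB = 1887421612    # df: used 1K-blocks
CLUSTERS = 1054130508       # debugfs.ocfs2 stats: "Clusters"
CLUSTER_SIZE_BITS = 12      # debugfs.ocfs2 stats: "Cluster Size Bits"


def volume_bytes(clusters: int, cluster_size_bits: int) -> int:
    """Volume size in bytes: cluster count times 2**cluster_size_bits."""
    return clusters << cluster_size_bits


def used_fraction(used_kib: int, total_kib: int) -> float:
    """Fraction of the volume in use, from the df columns."""
    return used_kib / total_kib


if __name__ == "__main__":
    from_superblock = volume_bytes(CLUSTERS, CLUSTER_SIZE_BITS)
    from_df = DF_TOTAL_KIB * 1024
    print("bytes from superblock:", from_superblock)
    print("bytes from df:        ", from_df)
    print("sizes agree:          ", from_superblock == from_df)
    print("used fraction:         %.3f" % used_fraction(DF_USED_KIB, DF_TOTAL_KIB))
```

With the values above, both computations give the same byte count, and the used fraction is a little under 0.45, which matches "a little less than half full".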
-------------- next part --------------
A non-text attachment was scrubbed...
Name: stat_sysdir.sh
Type: application/x-shellscript
Size: 1508 bytes
Desc: not available
Url : http://oss.oracle.com/pipermail/ocfs2-users/attachments/20150919/5c4c3b99/attachment.bin 

